The achievement motive construct and its measurement:
Where are we now?


Stephen Fineman*





The achievement motive (nAch) is a construct which has attracted many applied researchers. The present 
paper reviews the nature and conceptualization of nAch, focusing on the considerable number of different 
projective and questionnaire instruments which have been evolved for its measurement. The instruments are 
examined for their convergent validity, which is found to be very poor. Reasons for this are explained in 
terms of the psychometric characteristics of the instruments and whether nAch should be conceptualized as 
a conscious or unconscious variable. Guidelines for the future development of nAch measures are presented. 





The purpose of this paper is to examine the current position regarding the conceptualization and 
measurement of the achievement motive. There follows initially a description of the concept 
after which the various measures are reviewed and their psychometric properties evaluated. 
Reasons for the lack of association between different techniques are discussed, and finally 
suggestions are made for future developments in the field.


The achievement motive construct 


We can go back over 80 years to the psychological writings of William James to find mention of 
the notion and importance of achievement strivings. James (1890, pp. 310-311) talks of man’s 
self-regard as being determined by self-imposed goals, the achievement of which leads to feelings
of wellbeing and elevation, while failure brings about frustration and humiliation. Twenty years
later, in Germany, Narziss Ach (1910) was utilizing the concept of ‘determining tendency’ to 
explain the achievement-related behaviour of his laboratory subjects. But the formalization of the 
achievement motive construct derives primarily from the work and theory of Murray (1938). 
Murray and his colleagues conducted an in-depth ‘multiform’ study of 50 young men; from this 
emerged a taxonomy of personality needs, defined as hypothetical constructs reflecting 
physiological ‘forces’ which direct behaviour. One of these is the need for achievement 
(abbreviated ‘nAch’). This motive was defined as follows: ‘...the desire or tendency to do
things as rapidly and/or as well as possible. [It also includes the desire] to accomplish something 
difficult. To master, manipulate and organise physical objects, human beings or ideas. To do this 
as rapidly and independently as possible. To overcome obstacles and attain a high standard. To 
excel one’s self. To rival and surpass others. To increase self-regard by the successful exercise 
of talent’ (Murray, 1938, p. 164).

Murray's interest in need systems heavily influenced David McClelland (1951, 1955) in his own 
motivational theory. As with William James, McClelland focuses upon the affect associated with 
the achievement motive, defining nAch as the positive or negative affect aroused in situations 
that involve competition with a standard of excellence, where performance in such situations can 
be evaluated as successful or unsuccessful (McClelland, Atkinson, Clark & Lowell, 1953, 
pp. 75-81). McClelland's formulations are fairly firmly rooted in the psychoanalytic school of 
motivation, an orientation which again attests to the influence of Murray, and also of Freud. 
McClelland’s early work was largely devoted to developing a general theory of achievement 
motivation (McClelland et al. 1953), but in more recent years he has become concerned with 


* This research was undertaken while the writer was a member of the Medical Research Council's Social 
and Applied Psychology Unit, Sheffield University. 


applying the theory to problems of economic growth, and also specific issues of entrepreneurial 
and managerial behaviour (McClelland, 1961; McClelland & Winter, 1969). 

The essence of the Murray-McClelland definitions of the achievement motive is central to
those of other major writers in this field. Atkinson (1957, 1964) (one of McClelland's early 
colleagues) talks of nAch in terms of the capacity for taking pride in accomplishment. His 
analysis of the antecedents to achievement behaviour focuses upon not just the motivation to 
achieve, but also on the motivation to avoid failure. Together, these motivational tendencies 
determine whether a person will ultimately approach or avoid an achievement task. Each of the 
motives themselves is seen as a function of two situational variables — the perceived expectancy 
of success, and the incentive value of the task activity. Heckhausen (1967) also develops a 
two-motive theory of achievement behaviour. His motives are termed ‘hope of success’ and 
‘fear of failure’. Hope of success is defined as ‘the striving to increase or keep as high as
possible, one's own capability in all activities in which a standard of excellence is thought to 
apply and where the execution of such activities can, therefore, either succeed or fail' 
(Heckhausen, 1967, pp. 4-5). 

Intuitively, the achievement motive concept appears very plausible. It seems to account for a 
particular type of commonly observed behaviour - striving to do well, desiring to fully utilize 
one's capabilities to succeed and to be judged by oneself and others on this success. It is 
therefore understandable that it should have held the attention of applied researchers interested 
in organizations where achievement goals are explicit — such as in schools, universities and 
industrial concerns. But while there may be some variability in describing nAch, there is far 
more disagreement about its measurement. This issue will be examined next. 


Measuring the achievement motive 


The attractiveness of nAch is indicated by the many different measures which have been 
devised, purportedly tapping the construct. Table 1 lists 22 instruments which the writer has 
been able to trace. These are of three main types: projective instruments; scales within 
comprehensive personality inventories; and specific questionnaire measures of nAch. 


1. Projective instruments 


The McClelland et al. (1953, pp. 98-99) version of Murray’s (1943) Thematic Apperception Test
(TAT) is the most commonly used projective measure of nAch, and it is intrinsic to the 
development of McClelland's achievement motivation theory. McClelland argues that the analysis 
of ‘fantasy’ is the best approach to nAch measurement, and this is performed by content 
analysing subjects' written stories to (usually) four picture cards designed to elicit achievement 
themes. The scenes depicted are: two men (‘inventors’) in a shop working at a machine; a boy in
a checked shirt at the desk, an open book in front of him; Murray’s TAT card 7BM ‘father-son’
picture; and card 8BM ‘boy and operation scene’. Subjects are presented with each picture for 
20 sec, and after each exposure they are asked four questions: (i) What is happening? Who are 
the persons? (ii) What has led up to this situation? That is, what has happened in the past? (iii) What 
is being thought? What is wanted? By whom? (iv) What will happen? What will be done? The 
stories are scored according to a content-analysis system described by McClelland et al. (1953) 
and Smith & Feld (1958), and interscorer reliabilities of between 0.80 and 0.90 are usually
reported (e.g. Atkinson, 1958; Mitchell, 1961). (The reader of the mainstream of the nAch 
literature is very soon impressed by the almost evangelical enthusiasm that is expressed about 
the good coding reliability which can be obtained for the TAT. Clearly this is an important factor, 
showing that the scoring category system is well defined and does relate to what appears on the 
TAT protocols. However, such ‘‘hard’’ statistics should not detract from the fact that this says 
nothing about the internal reliability of the measure, and incidentally, that questionnaire 
measures are likely to be even better in their interscorer agreement.) 


Table 1. List of nAch measures

Abbreviation  Full title                                        Source

Projective measures
TAT           McClelland’s Thematic Apperception Test           McClelland et al. (1953)
Heckhausen    Heckhausen’s Thematic Apperception Test           Heckhausen (1967)
FTI           French Test of Insight                            French (1958)
IPIT          Iowa Picture Interpretation Test                  Hurley (1955)
Graphic       Graphic Expression Technique                      Aronson (1958)
Tartan        Knapp Tartan Test                                 Knapp (1958)

Comprehensive personality inventories
EPPS          Edwards Personal Preference Schedule              Edwards (1959)
CPI           California Psychological Inventory                Gough (1957)
PRF           Personality Research Form                         Jackson (1967)
SDI           Self-Description Inventory                        Ghiselli (1971)
ACL           Adjective Check List                              Gough (1960)

Specific questionnaire measures
MAS           Mehrabian Achievement Scale                       Mehrabian (1968)
CAMS          Costello's Achievement Motivation Scale           Costello (1967)
LAMQ          Lynn's Achievement Motivation Questionnaire       Lynn (1969)
HAMQ          Hermans' Achievement Motive Questionnaire         Hermans (1970)
vAch          The v Achievement Measure                         de Charms et al. (1955)
SCT           Mukherjee’s Sentence Completion Test              Mukherjee (1965)
RAMQ          Robinson’s Achievement Motivation Questionnaire   Argyle & Robinson (1962)
ARPS          The Achievement Risk Preference Scale             O’Connor & Atkinson (1962)
SAS           Sherwood Achievement Scale                        Sherwood (1966)
AAMI          Aberdeen Academic Motivation Inventory            Entwistle (1968)
SAMM          Smith’s Achievement Motivation Measure            Smith (1973)


A projective technique closely related to McClelland’s is a set of six pictures devised by 
Heckhausen (1967, 1969). These depict German school and occupational settings featuring 
blue-collar and white-collar situations. Subjects’ stories are keyed for hope of success and fear 
of failure, in a similar fashion to the TAT procedure. 

French (1958) has constructed a ‘Test of Insight’; an nAch measure based on McClelland's 
rationale but more structured than the TAT. Statements such as ‘Joe is always willing to listen’ 
and ‘Bill always lets the other fellow win’, are presented to subjects, and they are instructed to 
write a short account describing the characteristics and motives of the character in the 
statement. Responses are content-analysed for nAch. The Iowa Picture Interpretation Test 
(IPIT) (Hurley, 1955) also attempts to add some structure to the traditional projective method. 
The measure has a multiple choice format, where four alternative choices are offered to ten of 
the Murray (1938) TAT cards. These alternatives are then ranked by subjects according to their 
preference. Each of the four choices represents one of four response classes: achievement 
imagery, insecurity, blandness, and hostility. 

A rather different projective approach has been taken by Aronson (1958), who has developed a 
non-verbal measure of nAch using graphic expression - or doodles — as the testing stimulus and 
response mode. Basic ‘scribble patterns’ are slide-projected to groups and after each slide two 
minutes is allowed for reproduction of the design. The scoring system is based upon empirical 
relationships between the reproductions and the TAT nAch score. The Tartan Test (Knapp, 
1958) is also based upon empirical associations with the TAT, but in this case tartan-preference 
is the correlate. Thirty tartans are ordered by subjects into a forced-distribution of preference. 
High nAch is said to be indicated by preference for blue in the tartan, while low nAch is
associated with red preference. 


2. Comprehensive personality inventories 


The Edwards Personal Preference Schedule (EPPS) (Edwards, 1959) is a 225-item inventory 
designed to measure 15 needs, one of which is nAch. Each item is a personally descriptive pair 
of statements which are matched according to their average social desirability ratings, as derived 
from a sample of students. The subject is required to select the statement in each pair which most
characterizes him. Another ‘omnibus’ personality inventory is the California Psychological
Inventory (CPI) (Gough, 1957), a 480-item true-false questionnaire. Two of the 18 scales in the
inventory are related to nAch - ‘achievement via conformance’ (CPI Ac) and ‘achievement via
independence’ (CPI Ai). Gough’s description of the latter scale appears closest to the nAch
concept as discussed at the beginning of this paper. 

The Personality Research Form (PRF) (Jackson, 1967) is also a very comprehensive 
instrument and contains, in its shortest form, 300 true-false items. Fifteen different scales are 
scored, one of these being nAch, derived directly from Murray's need theory. 

Ghiselli (1954, 1968, 1971) has developed an empirically based inventory (SDI) consisting of 64 
pairs of personality descriptive adjectives, matched for social desirability. In the first half of the 
test the respondent chooses the adjective in each pair which he considers better describes him, 
and in the second half he chooses the one which he considers less well describes him. Thirteen 
traits are scored, including achievement motivation (this scale title was changed to ‘need for
occupational achievement’ in Ghiselli's 1971 book, but the items and scoring system remain as
originally developed). A final inventory is the Adjective Check List (ACL) (Gough, 1955). In this
instrument 300 adjectives can be selectively self-checked according to their relevance to the 
subject's own behaviour. Heilbrun (1958) has developed 15 need scales for the ACL, one scale 
being based on Murray's nAch concept, comprising 38 adjectives. 


3. Specific questionnaire measures of nAch 


Eleven of the scales in Table 1 have been designed to measure nAch alone. Mehrabian (1968, 
1969) has constructed a male and female version of a 26 item nAch scale (MAS) where the 
extent of agreement or disagreement with items such as 'I worry more about getting a bad grade 
than I think about getting a good grade' are recorded. He has also produced a nine-item 
shortened version of this instrument. Costello (1967) describes two scales which emerged from 
factor analytic studies of responses to yes-no questions in a prototype nAch scale (CAMS). The 
first scale, comprising ten items, is described as ‘wanting to do a job well’. The other scale, of
12 items, relates to ‘dispositions of the person who strives to be a success’. Another factor
analytically derived scale is that of Lynn (1969) (LAMQ). It is made up of eight yes-no 
questions, such as ‘Do you like getting drunk’, and ‘Do you dislike seeing things wasted’. 
Hermans (1970) has cluster analysed responses of Dutch students to a 92-item prototype nAch 
measure (HAMQ). A 29-item nAch factor was extracted. Items refer to preferences for a variety 
of activities, e.g. ‘shopping’ and ‘being busy’. De Charms, Morrison, Reitman & McClelland 
(1955) have constructed a nine-item measure based upon Murray’s work, which concerns the 
value placed on achievement activities, hence abbreviated as ‘vAch’. Example items are ‘I 
enjoy work as much as play’ and ‘I set difficult goals for myself which I attempt to reach’.

A more complex approach has been taken by Mukherjee (1965). His scale, the Sentence 
Completion Test (SCT), is made up of 50 items each with three statements matched for social 
desirability, only one of which scores for nAch. A subject is instructed to choose the one 
statement in each triad which most characterizes him, and the one which least characterizes him. 
Argyle & Robinson (1962) describe a questionnaire nAch measure (RAMQ) resulting from some 
unpublished work of Robinson. Fifteen items define the scale, for example, ‘In how many 
activities do you wish to do your very best?’ and, ‘How strong is your desire to avoid
competitive situations?' 

The notion of achievement behaviour being a function of two motives - a motive to approach 
success and a motive to avoid failure - has been embodied in O'Connor & Atkinson's (1962) 
Achievement Risk Preference Scale (ARPS). An index of resultant motivation is obtained from 
responses to items such as ‘I’d rather: (a) like being interviewed for a job, (b) dislike being 
interviewed for a job’ and, ‘I recite in class: (a) less than other students, (b) more than other 
students’.

The Sherwood Achievement Scale (SAS) (Sherwood, 1966) is a very short self-report 
questionnaire of nAch. There are three items, referring to competitiveness, striving for 
accomplishment, and goal setting, each judged on seven-point rating scales. Entwistle (1968) has 
constructed a 24-item yes-no self-rating inventory (AAMI) designed to assess nAch in an 
academic setting. Examples of items are: ‘Do you like being asked questions in class?’ and ‘Do you
enjoy most lessons?’ Finally, a measure has recently been presented by Smith (1973). This ‘quick
measure of achievement motivation' (SAMM) contains 17 true-false items, such as 'I don't 
think I'm a good trier’, and ‘Failure is no sin’. 


Relationships between different nAch measures 


The above 22 instruments all purport to measure the achievement motive; it is therefore 
reasonable to expect high statistical interrelationships between different techniques. Some 
evidence on this has been provided by Weinstein (1969) who examined the published relationships 
between the TAT* and FTI, TAT and Graphic, TAT and EPPS, TAT and vAch, FTI and EPPS, 
and TAT and SAS. Of 21 separate correlations only two emerged as statistically significant. He
also reports the results of his own study on the intercorrelations between seven nAch
instruments - the TAT, FTI, Graphic, ARPS, SAS, EPPS, and CPI (Ac and Ai). The average r
between the measures was a non-significant 0.04. Table 2 updates Weinstein’s review, and also
includes published findings on eight other measures - the IPIT, the MAS, the Tartan Test, 
HAMQ, SAMM, ACL, LAMQ, and the SCT. In all, 17 different measures are presented. There 
are no studies relating all of the previously mentioned 22 measures to each other. Furthermore, 
no studies of concurrent relationships can be traced for the Heckhausen projective method 
(although Heckhausen, 1967, p. 8, refers to some unpublished work by Vukovich, Heckhausen 
& von Hatzfeld where ‘a number of relations’ were found between the measure and some 
questionnaire items). 

Of the 78 r’s presented in Table 2 only 22 are statistically significant. In other words, 72 per 
cent of the correlations in the table indicate no significant relationship between pairs of nAch 
measures. The overall median correlation is 0.12. The relationships in Table 2 can be
summarized according to some of the main types of measures represented. The median r between
the TAT and questionnaire measures is 0.15, and with other projective methods 0.17.
Questionnaire techniques are virtually unrelated to each other (median r = 0.10) and the FTI
shows no correlation with questionnaire measures of nAch (median r = 0.01). Reviewing the status
of the TAT in 1958 McClelland commented: ‘The conclusion seems inescapable that if the n 
Achievement score is measuring anything, that same thing is not likely to be measured by any 
simple set of choice-type items’ (McClelland, 1958a, p. 38). The current status of the TAT 
certainly appears to reinforce McClelland’s opinion. But we can now go further to state that the 


* Atkinson & Raynor (1974, p. 191) severely criticize Weinstein's (1969) reported 0.77 scorer reliability
on the TAT. They contend that 0.85 is the lowest permissible for proper use of the instrument. To balance
this view, though, it should be noted that Weinstein was unusually thorough in some respects. He employed 
two teams, each with two scorers, who used the standard Atkinson (1958) scoring system (most studies 
employ just one scorer who checks his coding ability against that of the examples in the scoring manual). 
Also, the initial classification of the main imagery categories achieved a relatively high agreement between 
the two teams of scorers at 79 per cent. 
Table 2. Correlations amongst different measures of nAch

Study                                Sample                                 Measures         Correlation

Himelstein et al. (1958)             71 Air Force Academy males             TAT-FTI          -0.07
Hofman (1965)                        112 high school males                                    0.17
Shaw (1961)                          18 high school male achievers                            0.25
                                     20 high school male underachievers                       0.09
Weinstein (1969)                     176-179 college males                                    0.08
Knapp (1958)                         68 college                             TAT-Tartan        0.18†
Aronson (1958)                       26 college males                       TAT-Graphic      -0.27
                                     18 college males                                         0.51*
Weinstein (1969)                     176-179 college males                                   -0.01
Atkinson and Litwin (1960)           47 college males                       TAT-EPPS         -0.05
Himelstein et al. (1958)             298 Air Force Academy males                              0.00
Hofman (1965)                        112 high school males                                    0.20
Marlowe (1959)                       44 college males                                        -0.05
Shaw (1961)                          18 high school male achievers                            0.12
                                     20 high school male underachievers                      -0.03
Bendig (1959)                        244 college (136 male, 108 female)                       0.11
Birney (in Atkinson, 1958a, p. 38)   300                                                      0.00
Melikian (1958)                      69 college (50 males, 19 females)                        0.16
Weinstein (1969)                     176-179 college males                                    0.10
Grant et al. (1967)                  148 managers                                             0.20's*
Morrison (in de Charms et al.        College females                        TAT-vAch          0.09
  1955, p. 421)
de Charms et al. (1955)              78 college males                                         0.23*
Sherwood (1966)                      37 college males                       TAT-SAS           0.40*
                                     80 college males                                         0.42**
                                     30 college females                                       0.29
Weinstein (1969)                     176-179 college males                                    0.07
Hines (1973)                         42 college and church                  TAT-LAMQ          0.32*
                                     52 college and church                                    0.35*
Smith (1973)                         89 males                               TAT-SAMM          0.48**
Weinstein (1969)                     176-179 college males                  TAT-CPI Ai        0.05
Skolnik (1966)                       41 boys                                                  0.01
                                     41 men                                                   0.28
                                     43 girls                                                 0.23
                                     43 women                                                 0.39**
Weinstein (1969)                     176-179 college males                  TAT-CPI Ac        0.07
Skolnik (1966)                       41 boys                                                  0.09
                                     41 men                                                   0.42**
                                     43 girls                                                 0.32*
                                     43 women                                                 0.18
Hermans (1970)                       30 college males                       TAT-HAMQ          0.13
                                     31 college males                                         0.20
Mehrabian (1968)                     108 college males                      TAT-MAS           0.28**
                                     109 college females                                     -0.11
Weinstein (1969)                     176-179 college males                  TAT-ARPS         -0.14*

* P<0.01; ** P<0.05.

† This is the highest reported r with blue preferences.
Table 2 (cont.)

Study                                Sample                                 Measures         Correlation

Himelstein et al. (1958)             71 Air Force Academy males             FTI-EPPS          0.02
Hofman (1965)                        112 high school males                                    0.17
Shaw (1961)                          18 high school male achievers                            0.51*
                                     20 high school male underachievers                       0.26
Weinstein (1969)                     176-179 college males                                    0.00
Gough & Heilbrun (1965, p. 22)       100 males                              CPI Ac-ACL        0.30**
                                                                            CPI Ai-ACL       -0.01
Gough & Heilbrun (1965, p. 14)       90 college                             EPPS-ACL          0.01
Edwards et al. (1972)                218 college (109 males, 109 females)   EPPS-PRF          0.25**
Mukherjee (1965)                     58 college mixed                       SCT-vAch          0.44**
Weinstein (1969)                     176-179 college males                  EPPS-CPI Ac      -0.12
Gough (1964, p. 37)                  45 males                                                 0.04
Weinstein (1969)                     176-179 college males                  EPPS-CPI Ai       0.01
Gough (1964, p. 37)                  45 males                                                 0.19
Barnette (1961)                      176 college mixed                      IPIT-CPI Ai       0.09
Mehrabian (1969)                     114 college males                      PRF-MAS           0.62**
                                     98 college females                                       0.37**
Weinstein (1969)                     176-179 college males                  FTI-ARPS         -0.08
                                                                            FTI-Graphic       0.05
                                                                            FTI-SAS           0.10
                                                                            FTI-CPI Ac        0.04
                                                                            FTI-CPI Ai       -0.07
                                                                            Graphic-CPI Ai   -0.03
                                                                            Graphic-ARPS     -0.06
                                                                            Graphic-SAS      -0.05
                                                                            EPPS-Graphic      0.16*
                                                                            EPPS-ARPS        -0.04
                                                                            EPPS-SAS          0.11
                                                                            ARPS-CPI Ac       0.23*
                                                                            ARPS-CPI Ai       0.17
                                                                            ARPS-SAS          0.18*
                                                                            SAS-CPI Ac        0.04
                                                                            SAS-CPI Ai       -0.04

* P<0.01; ** P<0.05.


TAT is also measuring something different from other projective techniques, and that there is as 
yet no evidence to suggest that alternative projective measures are tapping the same construct as 
the questionnaire measures. Furthermore, the questionnaire instruments are themselves tending 
to measure different things. The operationalization of nAch therefore seems considerably 
confused. Why should this be so? There appear to be at least two likely reasons. The first
concerns the psychometric adequacy of the various measures, and the second relates to the 
question of whether nAch should be treated as an unconscious or conscious variable. 
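
The summary figure quoted for Table 2 follows from two simple counts. The short Python sketch below recomputes the share of non-significant correlations from the totals given in the text (78 coefficients, 22 of them significant). The function name is illustrative only and is not part of the original paper.

```python
def share_non_significant(total: int, significant: int) -> float:
    """Return the percentage of coefficients that are not statistically significant."""
    if total <= 0 or not 0 <= significant <= total:
        raise ValueError("need total > 0 and 0 <= significant <= total")
    return 100.0 * (total - significant) / total


# Totals quoted in the text for Table 2.
pct = share_non_significant(total=78, significant=22)
print(f"{pct:.0f} per cent of the correlations are non-significant")
```

Rounded to the nearest whole number, this gives the 72 per cent reported above.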


Psychometric adequacy 

(a) Internal consistency 

A primary psychometric requirement of an nAch measure is that its items or parts are 
sufficiently homogeneous to consider it as tapping a unitary construct. Given suitable internal 
consistency one may then proceed to investigate the stability and validity of the measure. 
Several writers recommend reliability coefficients of at least 0.50 or 0.60 as necessary for
distinguishing between groups on an hypothesized construct (Garrett, 1965, p. 351; Nunnally,
1967, p. 227), although much higher levels (0.90 and above) are usually required for individual
diagnosis. 

A number of nAch internal consistency coefficients have been traced in the literature, and 
these are presented in the first column of Table 3. Looking first at the projective instruments 
(the first six in the table) four have reported reliabilities - the TAT, FTI, IPIT, and Graphic. 
Most frequently researched is the TAT, which has a median coefficient of 0.32, making it
potentially unsuitable for group and individual diagnosis. Considering the reliability coefficient as 
a direct measure of the proportion of test variance which is true variance (Garrett, 1965, p. 346), 
only 32 per cent of the TAT variance is true-score construct variance - the remaining 68 per cent 
is error variance. In other words, we can have little confidence that the TAT is measuring any 
unitary psychological construct, let alone nAch. This would preclude it correlating systematically 
with other nAch measures regardless of their own psychometric properties. Entwistle (1972) has 
also noted the poor internal reliability of the TAT, and has drawn attention to the views of 
Jensen (1959) and Mitchell (1961) which suggest that the TAT is too short to yield sufficient 
construct variance to overcome its error variance. However, there are problems in increasing the 
number of pictures in the TAT - a prime one being that subjects soon begin to tire. Atkinson 
(1954) has shown that the value of nAch scores after four pictures drops considerably. A further 
point is that increasing the time allotted for writing a story has been shown not to influence the 
obtained motive score (Lindzey & Heinemann, 1955). 
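
Two standard results from classical test theory sit behind this paragraph: the reliability coefficient read as the proportion of true-score variance, and the Spearman-Brown formula, which projects reliability when a test is lengthened with comparable parts. The paper does not name the Spearman-Brown formula; it is brought in here as an illustration. The Python sketch below applies both to the median TAT coefficient of 0.32. The function names are hypothetical, and the projection assumes that added pictures would behave like the existing ones, which the fatigue findings cited above call into question.

```python
def variance_split(reliability: float) -> tuple[float, float]:
    """Split test variance into (true-score %, error %) for a given reliability."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return 100.0 * reliability, 100.0 * (1.0 - reliability)


def spearman_brown(reliability: float, length_factor: float) -> float:
    """Project reliability when test length is multiplied by length_factor."""
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (1.0 + (length_factor - 1.0) * reliability)


# Median TAT internal consistency quoted in the text.
true_pct, error_pct = variance_split(0.32)
print(f"true variance {true_pct:.0f}%, error variance {error_pct:.0f}%")

# Hypothetical doubling of the test from four pictures to eight.
doubled = spearman_brown(0.32, 2)
print(f"projected reliability at double length: {doubled:.2f}")
```

On these assumptions, doubling the number of pictures would raise 0.32 only to about 0.48, still short of the 0.50 to 0.60 band recommended for group comparisons.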

The FTI's internal consistency is reported in only a single study as 0.48. While this is an
improvement on the TAT it is still a little too low to reflect a clear construct. The IPIT and the
Graphic are also poor in this respect. No internal consistencies could be traced for the 
Heckhausen method or for the Tartan. 

It was not possible to locate internal reliability for all the questionnaire measures in Table 3, 
but where statistics were available they revealed a rather different picture to that of the 
projectives. As can be seen from the table, the coefficients are almost invariably high, suggesting
sufficient construct purity for group predictions. Two scales - the LAMQ and vAch - have
no overall homogeneity estimates, but their item loadings seem sufficient to assume that their
internal consistencies are good. In summary, one may be reasonably confident that nine
questionnaire nAch measures - the EPPS, PRF, MAS (short form), CAMS, HAMQ, SCT,
SAMM, LAMQ and vAch - are each measuring some dimension.


(b) Stability across time 
A measure which has poor internal consistency is very unlikely to be stable over time. This view 
tends to be borne out by the stability coefficients for the projective instruments in the right-hand 
column of Table 3. The median correlation for the TAT is low at 0.32. De Charms (1968)
comments on this problem: ‘Sufficient it is to say that high scorer reliability is usually 
meticulously adhered to, whereas test-retest reliability is disappointingly low...’ (p. 184). 
Similar sentiments are expressed by Entwistle (1972). Heckhausen's method is somewhat more 
stable than the TAT, although he himself concluded that it has ‘serious disadvantages 
psychometrically speaking’ (Heckhausen, 1967, p. 20). 

McClelland (1958a, pp. 18-20) mentions some of the problems of reliability in the TAT,
Table 3. Internal consistency and stability of nAch measures

Achievement
motive measure      Internal consistency¹                         Stability across time²

Projective measures
TAT                 0.27, 0.43 (Child et al. 1956)'               0.22* (2 weeks) (McClelland, 1955)
                    0.54 (Lindzey & Herman, 1955)                 0.26 (9 weeks) (Krumboltz & Farquhar, 1957)
                    0.28, 0.32, 0.38 (Reitman & Atkinson, 1958)   0.16*ʰ, 0.22ʰ, 0.32ʰ (3 years) (Kagan & Moss, 1959)
                    0.27 (Weinstein, 1969)                        0.44ʰ (2 weeks) (Lowell, in Kagan & Moss, 1959)
                    0.31? (Scott & Johnson, 1972)                 0.34', 0.36', 0.31' (10 years) (Moss & Kagan, 1961)
Heckhausen          -                                             0.40 to 0.60 (5 weeks) (Heckhausen, 1967, p. 20)
FTI                 0.48 (Weinstein, 1969)                        -0.06ᵃ, 0.17ᵃ, 0.45 (5 months) (French, 1955)
                                                                  0.36 (7 weeks) (Himmelstein & Kimbrough, 1960)
IPIT                0.34* (Hurley, 1955)                          0.52! (6 weeks) (Hurley, 1955)
Graphic             0.21* (Aronson, 1958)                         0.36 (1 week) (Weinstein, 1969)
Tartan              -                                             -

Comprehensive personality inventories
EPPS                0.74 (Edwards, 1959)                          0.74 (1 week) (Edwards, 1959)
                    0.59? (Scott & Johnson, 1972)
CPI Ac³             -                                             0.73, 0.60 (1 year); 0.79 (7-21 days) (Gough, 1964)
CPI Ai³             -                                             0.57, 0.63 (1 year); 0.71 (7-21 days) (Gough, 1964)
PRF                 0.77, 0.77, 0.86 (Jackson, 1967)              0.80 (1 week) (Jackson, 1967)
                    0.73ᵈ, 0.72ᵈ (Jackson, 1967)
SDI³                -                                             -
ACL                 -                                             0.81, 0.74 (10 weeks) (Gough & Heilbrun, 1965)
                                                                  0.60 (6 weeks) (Gough & Heilbrun, 1965)
                                                                  0.52 (5½ weeks) (Gough & Heilbrun, 1965)

Specific questionnaire measures
MAS (long form)     -                                             0.78 Male Scale, 0.71 Female Scale (10 weeks)
                                                                  (Mehrabian, 1968)
MAS (short form)    0.76*, Male Scale; 0.71ᵈ, Female Scale        -
                    (Mehrabian, 1968)
CAMS                0.73, 0.82 (Costello, 1967)                   -
LAMQ                0.36ᶠ (Lynn, 1969)                            -
HAMQ                0.82ᵈ (Hermans, 1970)                         -
vAch                0.30ᵍ (de Charms et al. 1955)                 -
SCT                 0.72ᵈ (Mukherjee, 1965)                       0.71 (2 months), 0.75 (6 weeks), 0.83 (3 months)
                                                                  (Mukherjee, 1965)
RAMQ                -                                             -
ARPS                -                                             -
SAS                 -                                             -
AAMI³               -                                             0.83 (2½ months) (Entwistle, 1968)
SAMM                0.56 (Smith, 1973)                            -

¹ Significant split-half r, unless otherwise noted.
² Significant test-retest r, unless otherwise noted.
³ This instrument is empirically constructed, a strategy which depends on the relationship with a criterion as
demonstration of item consistency rather than on the internal characteristics of the measure.
ᵃ Not statistically significant.                                  ᵇ Coefficient alpha.
ᶜ r between five-choice subgroups.                                ᵈ Kuder-Richardson Formula 20.
ᵉ Average item-whole.                                             ᶠ Median item loading on a single factor.
ᵍ All items loading at least 0.30 point-biserial correlation.     ʰ Phi coefficient.
ⁱ Contingency coefficient.                                        ʲ Rank-order correlation.
ᵏ Average item-item.

particularly its sensitivity to the conditions of testing. Because it is so hard to replicate testing
conditions McClelland recommends that high test-retest reliability should not be insisted upon. 
Instead the stability of motivational dispositions (i.e. the ‘real’ rather than the apparent stability) 
should be inferred indirectly from validity. This is an interesting argument. Jensen (1959) notes 
that if a test does have a demonstrable validity it indeed must also be reliable. But it is generally 
validity which is much more difficult to demonstrate than reliability (because of sampling and 
criterion problems). Hence the sequence of test construction should commence with reliability, 
and this will then determine the upper limit of the validity. It may be, though, that the 
peculiarities of projective tests do in fact make reliability (retest and internal) determination 
more difficult than validity considerations: hence inferring reliability from validity could be an 
appropriate procedure. The onus of proof thus falls heavily upon the validity studies. We will 
examine some of these shortly. 
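
The statement that reliability sets the upper limit of validity can be made explicit with the standard attenuation bound from classical test theory. This is offered as an illustrative aside rather than as a formula given by Jensen or McClelland; the symbols are assumptions introduced here, with r_xx the reliability of the nAch measure, r_yy the reliability of the criterion, and r_xy the observed validity coefficient.

```latex
r_{xy} \;\le\; \sqrt{r_{xx}\, r_{yy}} \;\le\; \sqrt{r_{xx}}
```

On this reading, a measure with a reliability of 0.30 could show an observed validity of at most about 0.55, even against a perfectly reliable criterion. The bound presupposes that the reliability coefficient itself has been estimated appropriately, which is exactly what McClelland disputes for projective tests.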

The stability across time of the more structured projective methods is also largely 
disappointing. The FTI is very poor, two of the four correlations being non-significant. The 
one-week stability of the Graphic technique is low; the IPIT is rather better. No data are 
available for the Tartan. 

Seven questionnaire measures of nAch have reported stabilities, and they are all good. Most 
impressive is the CPI which is stable up to a period of one year. However, the stability of over 
half the questionnaire measures is unknown, and it is unfortunate that several of the available 
test-retest reliabilities are over short periods. This is particularly serious when the gap between 
successive administrations is as short as one week (e.g. Edwards, 1959; Jackson, 1967) as recall 
factors can then have their greatest influence on the retest score. 


(c) Validity 

It has been previously argued that the poor internal consistency and stability of the projective 
instruments, particularly the TAT, is itself a sufficient explanation of the lack of relationships 
with other nAch measures. Such reasoning follows conventional psychometric theory, and it is 
in this vein that Entwistle (1972) comments: ‘The reader may find it perplexing that need 
achievement, a measure that seems to have low reliability, is purported to predict school 
grades or performance at tasks like anagrams’ (p. 386). Let us examine this point through 
existing validity studies. 

The most direct (and most common) type of validity study has been to seek a relationship 
with a performance criterion, a strategy which is consistent with the postulate of McClelland et 
al. (1953, p. 80) that there should be ‘...a significantly positive but moderate correlation 
between n Achievement and the actual efficiency of performance of various sorts’. 

Klinger (1966) shows that of 59 reported relationships between TAT nAch and performance 
(e.g. course grades, grade averages, long-term behaviour patterns, performance at specific tasks) 
28 are statistically significant and 31 are non-significant. Furthermore, Entwistle (1972) argues 
that the positive studies noted by Klinger could represent occasions when the relationships 
between motive scores and grades are accounted for by the intelligence level of the respondent. 
Other studies interpretable in terms of the criterion validity of the TAT concern the prediction of 
managerial or executive performance. McClelland (1961, pp. 267-71) describes six investigations 
of this sort. Three of these showed positive relationships between nAch and performance (as 
predicted), two revealed no relationships, and one indicated equivocal results. Four other 
supportive studies can be traced in the literature (Schrage, 1965; Cummin, 1967; Grant, Katovsky 
& Bray, 1967; Wainer & Irwin, 1969), and two where no nAch-performance relationship has 
emerged (Harrell, 1969, 1970). 

The validity relationships so far mentioned are based on the assumption that a small direct 
link between nAch and performance should exist. Even though this might be expected at a 
modest level, the theoretical propositions of Atkinson & Feather (1966) have suggested that this 
view could be an oversimplification of the issue and that performance and nAch will most 
positively relate when an expectancy of satisfying nAch has been aroused, and when 
expectancies of satisfying motives which may confound this relationship (e.g. fear of failure) have 
not been aroused. McClelland (1961, p. 226) has also modified his early thoughts on the 
performance prediction question and has stated that nAch will lead to harder work only when 
there is a chance that personal efforts will make a difference in the outcome. 

It could be, then, that some of the studies which demonstrate a lack of relationships between 
an nAch measure and performance are reflecting inappropriate situational circumstances for 
achievement-motivated performance. Investigations by Atkinson & Reitman (1956), Atkinson & 
Litwin (1960), Reitman (1960), Feather (1961), Andrews (1967), Karabenick & Yousef (1968), 
and Litwin & Stringer (1968) do indicate the importance of situational factors in performance 
predictions with the TAT; however, confusing results emerged from a study by McKeachie 
(1961). 

There have been a number of studies relating to other aspects of the achievement motivation 
theory outlined by Atkinson & Feather (1966) which concern the construct validity of the TAT. 
One of the most researched issues here concerns the ‘moderate risk’ hypothesis. This states, in 
its general form, that when performance is confined to achievement-related activities individuals 
in whom the motive for success (i.e. nAch) is greater than the motive to avoid failure will select 
tasks of intermediate difficulty with greater frequency than individuals in whom the motive to 
avoid failure is greater than the motive to approach success. Several investigators have tested 
this hypothesis using the TAT as the nAch measure and the Mandler-Sarason Test Anxiety 
Questionnaire (Mandler & Sarason, 1952) as an index of the motive to avoid failure. Support for 
the hypothesis is reported by Litwin (1958), Atkinson & Litwin (1960), Mahone (1960), 
Burnstein (1963), Littig (1963), Moulton (1965), Atkinson & O'Connor (1966), Morris (1966) and 
Raynor & Smith (1966) (in games of skill). No support was found by Raynor & Smith (1966) (in 
games of chance) nor by de Charms & Davé (1965). In a detailed examination of the risk-taking 
hypothesis Weinstein (1969) found support on only two out of nine risk preference 
variables. 

Some studies on the Atkinson hypothesis have not used a separate measure of the motive to 
avoid failure, but have assumed that a low score on the TAT implies a strong motive to avoid 
failure. This technique has been employed by Scodel, Ratoosh & Minas (1959), Meyer, Walter & 
Litwin (1961) and Crockett (1962), all of whom find support for the hypothesis. 

On balance, many of the findings from the risk-taking studies which have used the TAT do 
show a fair amount of consensus in upholding the moderate risk prediction, but this has often 
been amongst studies which have used a limited range of experimental performance tasks (e.g. 
connecting dots, puzzle tracing, ring toss). Weinstein's (1969) study showed that support did not 
tend to generalize across a number of very different risk preference variables (but the low coding 
reliability in this study could weaken this finding - see footnote on p. 5). Furthermore, 
comparison between different studies is hampered by the fact that an intermediate risk is defined 
in many different ways. As noted by Atkinson & Feather (1966, p. 357), ‘Accurate specification 
of intermediate risk is the most important prerequisite for testing some of the basic implications 
of the theory of achievement motivation.’ This point is reinforced by de Charms (1968, p. 221), 
and more recently by Hamilton (1974). 

One particular technique of validation for the TAT, specifically favoured by McClelland et al. 
(1953), is the arousal of nAch through variations in experimental procedures. Their own evidence 
seems to support the hypothesis that nAch arousing conditions will be directly reflected in TAT 
scores, but other work by Atkinson (1953), Peak (1960), Smith (1961) and Tedeschi & Kian 
(1962) (all noted by Klinger, 1966) fails to replicate this phenomenon. 

A rather different approach has been taken recently to the question of the TAT’s construct 
validity. Atkinson & Bongort (1975) have used a computerized representation of the quantitative 
achievement motivation theory described by Atkinson & Raynor (1974) in order to predict the 
time that hypothetical subjects will spend in imagining achievement activities. The input 
variables were the nAch levels of the hypothetical subjects and the achievement incentive values 
of each of a series of TAT pictures. The theoretically generated time spent thinking about 
achievement on each card was treated as the ‘true’ nAch score of the cards, and the sum over 
all of the cards in a series represented the total ‘true’ nAch score. From 25 computer 
simulations the researchers show that there is a very good ‘postdiction’ of the initial differences 
in strength of the achievement motive (as originally fed into the computer) from the true total 
scores, even when the TAT internal consistency (calculated from the true nAch scores of each 
card) is low. They see this finding as demonstrating the construct validity of the TAT, and that 
such validity does not require internal consistency. 

There is a prodigious amount of other work where the TAT has been employed in exploring 
predictions from achievement motivation theory. Most noteworthy of more recent contributions 
are Weiner (1972) on the effects of success and failure on performance and Raynor (1969, 1970) 
on the role of future orientation. It is not the intention of this writer to undertake a detailed 
critique of these studies, nor of the many other different investigations which are to be found in 
a variety of sources (e.g. Atkinson, 1958a; McClelland, 1961; Atkinson & Feather, 1966; 
McClelland & Winter, 1969; Atkinson & Birch, 1970; Atkinson & Raynor, 1974). However, if we 
take the consistencies evidenced in some of this work together with the results of the risk-taking 
studies, and also the interactional studies of performance, it is difficult to lightly dismiss the 
claim that the TAT has demonstrated some degree of construct validity. 

Klinger’s (1966) review of fantasy need achievement includes the results of published 
performance validity studies for the FTI. Of 14 relationships, nine are statistically significant. 
Four studies using the FTI support the Atkinson moderate risk hypothesis (Atkinson, Bastian, 
Earl & Litwin, 1960; Atkinson & Litwin, 1960; Isaacson, 1964; Hamilton, 1974), but in 
Weinstein’s (1969) study confirmation was evident on only four out of nine different risk 
preference tasks. (Weinstein (1969) reports higher scorer agreement on the FTI than he found for 
the TAT. The initial classification of the main imagery categories achieved a 94 per cent 
agreement between the two teams of scorers, and the rank correlation reliability coefficient 
was 0.84.) 

The IPIT also does not fare very well in the Klinger review, with only 5 out of 13 
relationships being statistically significant. Heckhausen (1967, p. 17) claims validity for his 
technique from correlations with behavioural indices, though details are not presented. The 
validity of the Graphic method is shown by one finding of Aronson's: a 0.50 correlation 
between his instrument and improvement on a scrambled words test. Also McClelland (1958b) 
demonstrates that a sample of pre-school children who score high on the Graphic take 
moderate risks, whilst low scorers take high or low risks. There is no reported validity for the 
Tartan technique. 

In sum, the direct performance validity of the TAT is poor, yet as an interactional predictor of 
performance and as a predictor of several aspects of achievement motivation theory, it appears 
to possess construct validity. With the exception of the FTI, the validity evidence for the other 
projective measures is not very encouraging. 

Turning to the criterion validity of questionnaire measures of nAch the picture is generally a 
little brighter. Table 4 summarizes the criterion validities reported in the major source references 
on each measure. Validity has been best demonstrated for the CPI (Ac and Ai), PRF, MAS 
(short form), SAS, and AAMI. Little performance validity is evident for the SDI, SCT and 
vAch, and none is reported for the EPPS, ACL, CAMS, or RAMQ. Commenting upon the 
validity of the EPPS Lake, Miles & Earle (1973, p. 19) say: ‘The EPPS has received 
considerable criticism for its insufficient validation to which Edwards in his revised manual has 
not replied.’ And of the ACL they say ‘The validity data are spotty and unreassuring’ (p. 2). 
The value of conceptualizing performance as an interactional phenomenon is nicely illustrated by 
Table 4. Criterion validities presented in major source references to the questionnaire measures of nAch 

Measure        Source              Criterion                    Sample                         Relationship with criterion† 

CPI (Ac)       Gough (1964)        School grades                1235 females                   0.41** 
                                                                946 males                      0.41** 
                                   Staff ratings                100 military officers          0.30** 
CPI (Ai)                           School grades                220 students                   0.44** 
                                   Success-potential rating     40 students                    0.31** 
                                   Course grades                917 students                   0.38** 

PRF            Jackson (1967)      Behaviour rating             94 students                    0.50** 
                                                                36 students                    0.62** 
                                   Trait rating form            40 students                    0.53** 
                                                                51 students                    0.52** 
                                                                202 students                   0.46** 
                                   Self-rating                  40 students                    0.55** 
                                                                51 students                    0.42** 
                                                                202 students                   0.65** 

SDI            Ghiselli (1971)     Job success                  177 managers                   0.24** (av) 
                                                                87 line supervisors            -0.07 (av) 
                                                                152 line workers               -0.08 (av) 
                                   Occupational level           34 unskilled, 69 semi-         Progressive increase in 
                                                                skilled, 64 skilled,           SDI score the higher the 
                                                                157 foremen, 102 clerks,       occupational level (no 
                                                                177 middle managers,           statistical test) 
                                                                113 upper managers, 
                                                                57 professionals 

MAS            Mehrabian           Zeigarnik effect             205 males                      Significant tendency 
(long form)    (1968, 1969)                                                                    favouring high nAch 
                                                                                               people 

MAS                                Match stick problem: 
(short form)                         No. of problems            57 males                       0.38** 
                                     attempted                  50 females                     0.23 
                                     No. of problems            57 males                       0.37** 
                                     solved                     50 females                     0.32* 

LAMQ           Lynn (1969)         Known groups                 200 students,                  Students score 
                                                                40 entrepreneurs,              significantly lower than 
                                                                28 professors,                 the other groups 
                                                                45 managers 

HAMQ           Hermans (1970)      Pursuit rotor: number        32 male students               ρ = 0.08, 0.11 (neutral condition) 
                                   of hits                      35 male students               ρ = 0.32, 0.41* (ach condition) 
                                   Academic performance: 
                                     Exams completed            80 students                    ρ = 0.14 (non-ach condition) 
                                                                38 students                    ρ = 0.57** (ach condition) 
                                     Grades                     80 students                    ρ = 0.13 (non-ach condition) 
                                                                38 students                    ρ = 0.34* (ach condition) 

* P < 0.05; ** P < 0.01; † r, unless noted otherwise. 
Table 4 (cont.) 

Measure   Source                    Criterion                      Sample                           Relationship with criterion† 

SCT       Mukherjee (1965)          Psychology examination         87 students                      0.12, 0.10 
                                                                   (51 male, 36 female) 
                                    Thurstone's Primary            87 students                      4 out of 24 r's significant. 
                                    Mental Abilities, plus         (51 male, 36 female)             Highest r = 0.26 
                                    other cognitive and 
                                    spatial tests 

vAch      de Charms et al.          Scrambled Words Test           45 females                       No significant difference 
          (1955)                                                                                    in performance between 
                                                                                                    high and low vAch subjects 

ARPS      Weiner (1966)             Zeigarnik effect               33 male students                 Significant tendency 
                                                                   37 female students               favouring high ARPS 
                                                                                                    males only 
          Kasl (in Atkinson &       Slope of reported              33 employees                     0.51** 
          O'Connor, 1966,           satisfaction with 
          p. 302)                   occupations at different 
                                    status levels 

SAS       Sherwood (1966)           Scrambled words test,          37 male students                 0.48* 
                                    plus digit adding test         30 female students               0.37* 
                                                                   80 male students                 0.45* 

AAMI      Entwistle (1968)          Teacher's motivation           41 (high) pupils vs. 67 (low)    Significant difference 
                                    rating;                        12 (high) pupils vs. 38 (low)    favouring high AAMI 
                                    Teacher's estimate of          158 pupils;                      Median ρ = 0.38**; 0.50**; 
                                    attainment                     1385 boys: age 13                0.41**; 0.41**; 0.36** 

SAMM      Smith (1973)              Known groups                   44 men from ‘Who's Who’          Significant difference 
                                                                   vs. 89 non-exceptional men       favouring ‘Who's Who’ 
                                                                                                    sample 
                                                                   The ‘Who's Who’ group            Significant difference 
                                                                   broken down into Business        favouring ‘Business and 
                                                                   and Commerce vs.                 Commerce’ sample 
                                                                   Universities and Civil 
                                                                   Service 

EPPS      Edwards (1959)            - 
ACL       Gough (1960)              - 
CAMS      Costello (1967)           - 
RAMQ      Argyle & Robinson         - 
          (1962) 

* P < 0.05; ** P < 0.01; † r, unless noted otherwise. 
the performance validity for the HAMQ in achievement-oriented conditions (shown in Table 4). 
But in another interactional study, using the EPPS (Pritchard & Karasick, 1973), no interactions 
were found. 

The test score differences between ‘known groups’ are quoted as validity for the LAMQ, 
SAMM, and the SDI. The superior recall of incomplete tasks by high nAch persons (Zeigarnik 
effect) has been cited as supporting the validity of the MAS (long form) and ARPS. The ARPS 
has also been found to correlate with the satisfaction of different status-level employees. 
However, these few findings provide only weak evidence of the construct validity of the 
respective measures. 

Overall, there seems sufficient validity evidence to be reasonably confident that six 
questionnaire measures - the CPI (Ac and Ai), PRF, MAS (short form), SAS, AAMI, and 
HAMQ - operate in a way which would be expected of nAch measures. But only two of these, 
the PRF and MAS (short form), have good reported internal consistency and stability. It appears, 
therefore, that the vast majority of questionnaire instruments do not survive a close scrutiny of 
their psychometric properties. 


nAch as a conscious or unconscious variable 


One reason why projective measures of nAch, such as the TAT, tend not to correlate with 
questionnaire techniques could be that the questionnaires are unable to tap the ‘real’ BREA 
whereas the projective measures are able to do so. McClelland is a strong advocate of this view. 
He emphasizes that one cannot be sure that a respondent’s choices on a questionnaire are 
‘dictated by motive only and not by, say, the conscious (and often inaccurate) self picture’ 
(McClelland, 1958a, p. 25). He also stresses the likelihood of conscious distortion of 
questionnaire results so that the subject may present himself in a good light. McClelland finds 
the answer to the questionnaire difficulties by turning to the analysis of fantasy, as developed 
through the psychoanalytic school of motivation and the TAT. He says: 


Here (in the measurement of motives) one may be guided by the experience of the psychoanalysts. After all, 
theirs is largely a dynamic or motivational psychology based almost exclusively on a single method of 
measurement, e.g. content analysis of free associations produced in the therapeutic session. Perhaps it is the 
method of measurement - informal and intuitive though it often is, to be sure - which led the best 
understanding of human motivation as yet available. It was this argument that led to an attempt to use more 
formally and systematically the psychoanalytic method of measuring motivation (McClelland, 1956, p. 332). 


Basically then, McClelland assumes that individuals are unlikely to be able and willing to 
report accurately on their nAch level in questionnaires; but not so in the TAT. The motive 
therefore becomes, in effect, unconscious, so precluding measurement by direct ‘conscious’ 
methods. This type of problem is noted by Holmes, as follows: ‘Unfortunately, once a trait is 
designated as unconscious it must then be assessed through the use of very time-consuming and 
often unreliable projective techniques’ (Holmes, 1971, p. 23). 

It should be stressed that McClelland does not actually define nAch as being unconscious - but 
his designation of the TAT as the only really appropriate measure means that it is treated as if it 
was an unconscious motive. It is interesting to note that Murray, who has clearly heavily 
influenced McClelland, is somewhat more charitable towards the questionnaire measurement of 
needs. He states that there are conscious needs which are susceptible to questionnaire analysis 
(Murray, 1938, p. 186), and he has himself used the questionnaire measurement of nAch, along 
with more psychodynamic techniques (Murray, 1938, chapter 3). The fact that Murray does not 
take an extreme ‘unconscious’ view has surprised some commentators in the field, such as 
Gardner Lindzey (1958, pp. 20-22). 

There does not seem to be a clear theoretical case for treating the nAch concept as an 
exclusively unconscious variable, and therefore it should be amenable to direct measurement. 
Indeed, Holmes (1968), after reviewing the concept of projection, finds that it is only 
characteristics of which the individuals are conscious that are in fact projected in measures such 
as the TAT. In some respects it seems very logical to assume that one’s nAch should be easy to 
verbalize. Achievement issues are very much part of the socio-cultural scene in Western society. 
The concepts of success, ambition, hard work, and attaining goals are common to our everyday 
vocabulary; we are often judged by ourselves and others in these terms. It would be surprising, 
therefore, if people were unable to make a reasonably accurate self-assessment on achievement 
dimensions. There will be individual exceptions to this, where achievement notions are 
suppressed or repressed for psychopathological reasons. But such occurrences should be relatively 
rare. 

Although an individual may be able to report on his nAch he may, as McClelland points out, 
be unwilling to do so accurately in a particular situation. This may be seen as a test-taking 
phenomenon when the subject knows that it is in his interest to give as socially desirable a 
picture of himself as possible. This is a particular problem for the questionnaire methods, and 
one which points to the need for suitable desirability controls. The TAT will be generally less 
susceptible to such influences, but as some studies have indicated, it is by no means totally 
immune to voluntary distortion (Weisskopf & Dieppa, 1951; Summerwell, Campbell & Sarason, 
1958; Kaplan & Eron, 1965). 


The future of nAch and its measurement 


It was argued earlier that the nAch construct was definitive and appealing to applied researchers. 
The tangle and confusion in the operationalization of the motive should not diminish the 
plausibility and worth of the construct. Our attention should turn to better ways of 
measurement, and here progress will be slow if measures are used which have poor reliability 
and validity, or where such psychometric issues have not been fully researched. The present 
review suggests that projective measures, and in particular the TAT, cannot be fully justified 
on conventional psychometric grounds: but it is exactly these criteria that the nAch 
traditionalists reject as inappropriate. Also, as pointed out previously, there is evidence in the 
large body of nAch literature to indicate that the TAT does predict a variety of behaviours; 
therefore it would be rash to totally dismiss it as a poor instrument. 

How can one resolve this apparent paradox of an instrument which is very unreliable, yet 
does seem to predict something? An interesting perspective on this is provided by de Charms 
(1968, pp. 208-213). He points out that the TAT will not scale, in the traditional sense, as 
different scores can indicate different types of achievement motivation. Hence one person may 
write achievement stories about inventors, doctors and businessmen, and another might write 
achievement stories only about school situations. The former person may show more general 
achievement behaviour than the latter person; however, the latter person will show much higher 
school achievement behaviour. In its current form the TAT should be considered as a measure 
of ‘extensity’ of nAch, the concern for achievement in a range of different situations. It is not 
measuring the intensity of motivation in a specific setting. In de Charms' words: 


From the point of view of measurement theory and scalability, it may be said that the n Achievement measure 
is not very respectable. But measurement and scaling theory are based on the concept of measurement of 
intensity with the goal of prediction of specific responses. Apparently it is a mistake to use the achievement 
motive measure as a measure of intensity with the goal of relating to some specific other measure. The 
measure does not partake of the virtues of unidimensionality and scalability as prescribed by measurement 
theory (de Charms, 1968, p. 212). 


This means that the TAT should not be used for predicting specific task performance but for 
more general patterns, or 'success in life'. De Charms points to studies on the prediction of 
career patterns (McClelland, 1965) and of economic success in various cultures (McClelland, 
1961; de Charms & Moeller, 1962) as illustrative of appropriate use of the TAT. One logical 
conclusion from this argument is that a set of TAT pictures designed for intensive measurement 
only should demonstrate scale properties. De Charms (1968, p. 212) reports some success in this 
respect using pictures portraying school settings alone. Another conclusion is that studies that 
experimentally test aspects of achievement motivation theory using specific tasks may do better to 
use a TAT comprising pictures with content as relevant as possible to the task to be performed, 
rather than the conventional version of the TAT. 

Rigid adherence to the TAT as the measure of achievement motivation has some specific 
non-psychometric disadvantages which limit progress in the field. While college students may be 
paid, or otherwise coerced, to complete the instrument in an experimental setting, such 
compliance is rarely obtainable in other organizational settings. In industry for example, the 
technique can all too readily be perceived as ‘silly’ and ‘irrelevant’, and even embarrassing for 
the psychologist and subject. Steps towards creating organizationally relevant pictures would 
improve this situation; nevertheless the actual testing procedure is potentially alienating, and 
often administratively very inconvenient. To ignore such issues really means to ignore the types 
of real settings where laboratory findings must, ultimately, be cross-validated. 

The evidence for the questionnaires, unlike that for the projectives, indicates that reliability 
and validity are attainable, but currently it is rare to find both criteria met (or researched) in any 
one instrument. Several issues suggest themselves if this situation is to be improved: 

(a) Conscious reporting of nAch. In principle people should be able to report on their level of 
nAch, therefore the questionnaire technique is a perfectly appropriate measurement tool. 

(b) Item domain. Validity (and reliability) is partly dependent on how well the instrument 
samples descriptions of nAch behaviour or attitudes. Most of the existing questionnaire 
measures are either explicitly or implicitly designed to tap general achievement strivings, yet 
their actual content varies enormously. It is unlikely that a three- or eight-item instrument will do 
full justice to the breadth of the concept (e.g. SAS and LAMQ). Ideally items should be culled 
directly from the theoretical definitions of the construct, and from as many different consistent 
research findings in the area as possible. If the measure is to be constrained to specific areas of 
achievement, such as school or job, the items still need to reflect the variety of achievement 
facets to be found in that particular domain. However, such measures should not be expected to 
correlate highly with more generalized instruments. 

(c) Response distortion. Only some of the existing questionnaire measures attempt to control 
for deliberate distortion of responses (the CPI, SDI, PRF, SCT and MAS). The socially desirable 
response to an nAch item is often very obvious to a subject who wishes to give a good 
impression of himself. This becomes particularly important in assessment situations where the 
subject knows he is to be evaluated, and that the ‘evidence’ may be used for or against him. In 
non-assessment settings items such as ‘How often do you seek opportunities to excel’ (Argyle & 
Robinson, 1962) and ‘Would you describe yourself as an ambitious person?’ (Costello, 1967) 
have the virtue of being simple and straightforward, and should elicit a reasonably honest response. 
(Unless, of course, a person has a strong individual predisposition to give socially desirable 
responses whatever the situation (Edwards, 1957; Crowne & Marlowe, 1960).) In evaluative 
circumstances, however, it is very easy for the most naive of subjects to fake responses to such 
items in order to present a socially desirable image of himself. This will influence the validity of 
the measure, and may well be a contributory factor to the lack of association between it and 
other nAch instruments. 

If an nAch measure is to be used in evaluative settings it should contain a control for social 
desirability. Indeed, it may be wise to assume that all uses of the measure are likely to evoke a 
certain degree of response distortion of this sort since testing will always be ego threatening to 
some extent. A control is thus always preferable. This may be in terms of equating pairs of 
statements for average social desirability value (Guilford, 1954; Edwards, 1957), or selecting 
single-choice items which have low-to-zero relationship with a desirability scale, a technique 
which has been employed by Jackson (1967). Another method is to detect social desirability, or 
faking good, in a testee’s responses, rather than directly control for them: the Minnesota 
Multiphasic Personality Inventory (Hathaway & Meehl, 1951) and the Eysenck Personality 
Inventory (Eysenck & Eysenck, 1964) employ ‘lie scales’ to do just this. These devices are not 
without their problems (e.g. Heilbrun, 1958, 1959; Scott, 1963, 1968), but they seem to be a step 
in the right direction. 

(d) Generalizability of the scale. The existing questionnaire measures (and projective 
instruments) have nearly always been developed on student populations - the vast majority of 
them being American male high school and college students. Seldom are non-student 
standardization data available, and face validity can be very low for non-student populations (see 
Fineman, 1975a). Achievement motive measures developed in such a way may be invalid 
beyond the immediate standardization group, and demonstrations of cross-validity are required. 

It is particularly remarkable that very little scale development has taken place with groups of 
managers, given that McClelland has devoted much time and energy to showing the importance of 
the nAch syndrome in managerial and entrepreneurial behaviour. No existing scales contain 
items which appear likely to be fully acceptable to industrial managers in terms of criteria such 
as the job relevance of item-content, ease and quickness of completion, and suitability for 
self-administration. The SDI, ACL, MAS, CAMS, LAMQ, vAch, RAMQ and SAS are 
reasonably acceptable in these terms, but only two of them - the SDI and LAMQ - have actually 
employed managers in the standardization procedure. Ideally, a scale for measuring the nAch of 
managers should be technically adequate and organizationally acceptable. Most existing measures 
seem to have neglected or simply ignored issues of acceptability. This can be a dangerous tactic 
as acceptability factors will directly influence the psychometric properties of a scale. 

The writer has attempted to face up to some of these issues in the construction of an nAch 
scale designed specifically for managers (Fineman, 1975a). The scale, the Work Preference 
Questionnaire (WPQ), has 24 items in forced-choice format. Each item comprises a pair of 
logically related statements about an ‘ideal organization’ or ‘ideal boss’, matched for social 
desirability. The internal consistency of the scale is 0.68 (KR20), and one-year stability r’s are 
0.58 and 0.55. Small significant relationships with managerial performance have been found 
(r’s = 0.25 and 0.20) and other construct validity data are reported. The influence of situational 
factors on performance predictions with the WPQ has also been explored (Fineman, 1975b). 
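
For readers unfamiliar with the KR20 coefficient quoted above, the following is a minimal sketch of the Kuder-Richardson formula 20 for dichotomously scored items, KR20 = [k/(k - 1)] x [1 - (sum of p.q over items)/(variance of total scores)]. It assumes each forced-choice item is scored 0 or 1 and uses the population form of the variance; the function name `kr20` and the toy responses are hypothetical and are not WPQ data.

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson formula 20 for dichotomous (0/1) item scores.

    responses: one list per respondent, each holding k item scores (0 or 1).
    """
    n = len(responses)
    k = len(responses[0])
    # Proportion of respondents scoring 1 on each item.
    p = [sum(row[i] for row in responses) / n for i in range(k)]
    sum_pq = sum(pi * (1 - pi) for pi in p)
    # Variance of respondents' total scores (population form: an assumption).
    total_var = pvariance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum_pq / total_var)


# Hypothetical toy data: 4 respondents x 3 items (not taken from the WPQ).
toy = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(round(kr20(toy), 2))  # 0.75
```

With a real scale the same function would be applied to the full respondents-by-items matrix; a value near 0.7, as reported for the WPQ, indicates moderate internal consistency.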

The author has attempted to improve upon Ghiselli’s empirically derived Self-Description 
Inventory by modifying it for the measurement of nAch alone (Fineman, 1976). This has 
capitalized on the convenience and social desirability control of those of Ghiselli’s items which 
bear some relevance to the theoretical nAch construct. Eighteen items define the revised scale. 
The KR20 internal consistency for a managerial sample is 0.70, and the one-year stability r's are 
0.79 and 0.56. Significant r's of 0.37, 0.14 and 0.13 were found when correlating the scale with 
managerial performance. As with the WPQ, other construct validity information is reported. 


Conclusions 


The present review suggests that the many different published measures of nAch are not tending 
to measure the same construct. The most commonly used projective instrument, the TAT, 
clearly fails when it is judged against certain traditional psychometric criteria; however, this 
disguises the fact that it may be an appropriate measure of the extensity, rather than intensity, of 
achievement motivation, and also that its performance validity is better viewed in interactional 
terms rather than in terms of simple linear relationships. It also appears to show construct validity 
when used for particular predictions derived from achievement motivation theory. 

In the realm of questionnaire measures of nAch there appears to have been an all-too-ready 
eagerness to develop new devices without sufficient thought about (a) the richness of the nAch 
construct, (b) other measures in the field, (c) response biases, and (d) face validity. A simplistic 
‘tidy’ measure may initially satisfy the psychometrician but it can often strike the respondent as 
naive, inappropriate and alienating. The problem for the test constructor is to balance the 
structured nature of the questionnaire with the more ambiguous ‘real’ world of the respondent. 
If measures are designed with specific populations in mind this balance will more likely be 
achieved. Such a strategy need not compromise the nAch construct if items are chosen 
carefully. A measure for managers may have differently phrased items than one for students, but 
the range of nAch issues reflected in the two can be equivalently represented. This approach is 
probably preferable to a generalist one where items are written either to be suitable for any 
respondents or specific to one population (e.g. students) but supposedly ‘answerable’ by others. 
If the study of nAch is to be meaningful we have to know that the motive is in fact being 
consistently measured. The present analysis of our past behaviour in this area suggests that it is 
a question we have tended to take too much for granted. 


Inappropriate constancy scaling as a factor in the Müller-Lyer illusion 


L. C. Morrison 


The inappropriate constancy theory of geometrical illusions assumes the existence of implicit depth cues 
which automatically set constancy scaling. In the present study the phenomenon of cue conflict has been 
employed to test for latent depth cues in the Müller-Lyer figures. 

The technique consisted in introducing a stereoscopic disparity between the figures by means of an 
aniseikonic lens and measuring the perceived depth between the figures. The magnitude of the stereoscopic 
effect depends on whether the perspective cues augment or inhibit the stereoscopic depth cues. The results 
were inconsistent with the misapplied constancy theory, but confirmed Pike & Stacey's (1968) observation 
that the illusion of size may act as a depth cue. 





Inappropriate size scaling has been proposed by Gregory (1963, 1972) to account for the 
Müller-Lyer and certain other geometrical illusions. His theory assumes that the illusion figures 
carry implicit depth cues, but depth as such is not normally perceived owing to the inhibitory 
effect of the textured background on which the figures are drawn. However, it is assumed that 
these depth cues automatically set constancy scaling whether or not depth is perceived in the 
figures. 




Figure 1. Müller-Lyer illusion. 


In the case of the Müller-Lyer illusion (Fig. 1) the converging and diverging fins at the ends of 
the two lines are presumably effective as perspective depth cues. The apparently shorter line (a) 
may be regarded as a proximal intersection of two receding planes which are suggested by the 
perspective of the fins at the extremities of the line. The apparently longer line (b) with the 
diverging fins suggests the shaft to be at a greater distance than the tips of the arrows and may 
be regarded as the distal intersection of two receding planes, e.g. the corner of a room. Since 
the line would normally be perceived as relatively distant it is perceptually expanded. 

Gregory (1972) has constructed luminous wire models and photographic transparencies of the 
geometrical illusions which were observed monocularly in a dark room. In these circumstances 
the figures were perceived in depth as would be expected from the theory. 

Pike & Stacey (1968) give two interpretations of the inappropriate size scaling theory: (a) that 
in the absence of a competing background, the perspective carried by the fins of each figure 
would set the apparently larger figure at a relatively greater depth, and (b) the perspective acts 
independently on each figure and is effective only in setting the shaft in depth relative to the fin 
tips, but not in establishing relative depth between the figures themselves. It is the second 
interpretation which Gregory seems to hold. 

Criticisms of the theory have been based generally on the observations of figures which 
present perspective yet produce no size illusion, or the illusion does not conform to what would 
be predicted from the depth cues (e.g. Brown & Houssiades, 1965; Day, 1965; Zanforlin, 1967). 
Hotopf (1966) gives evidence that the figures carry ambiguous depth cues and some subjects 
experience a reversal in depth. 

In Pike & Stacey's experiments with luminous Müller-Lyer figures which were viewed 
monocularly in a dark room, the majority of observers experienced the apparently smaller figure 
at a greater distance. This is inconsistent with the relative depth interpretation (a) of the theory, 
but neither confirms nor refutes the second interpretation (b). It may be argued in this case that 
the primary scaling set by the fins of the figures may itself be interpreted as a depth cue which 
sets the apparently smaller figure at a greater distance.* 

However, the observation of luminous figures in a dark room may be unsatisfactory as a test 
of Gregory's theory owing to the ambiguity of the situation and the possible intrusion of 
suggestive influences. If the fins of the figures act as latent perspective cues which are 
responsible for primary constancy scaling perhaps they may be made explicit by introducing a 
reinforcing depth cue, for example, stereoscopic disparity between the figures. In the present 
study this principle has been utilized as a method of testing the theory. The stereoscopic 
disparity was introduced by observing the figures in an unstructured space with an aniseikonic 
lens. The stereo-cues were presented in either of two ways: (a) in conformity with the presumed 
perspective, and (b) in conflict with it. In each case the perceived depth between the figures was 
measured. 

The theory of aniseikonic lenses and their perceptual effects is adequately dealt with in 
textbooks (e.g. Ogle, 1950). It will suffice to mention that they are essentially small afocal 
telescopic systems which magnify the retinal image and therefore, if placed before one eye, will 
modify the binocular stereoscopic cues. The magnification is usually confined to a specific 
meridian which is at right angles to the axis of the lens. 

If a surface in a subject's frontal plane is viewed with an aniseikonic lens (axis vertical) before 
his right eye, the surface will appear to tilt away on the right. The effect is depicted in Fig. 2, 
where a plane ABC will appear to rotate through the angle y. Moreover, those aspects of the 
surface which are perceived at a greater distance are automatically up-scaled so that a figure will 
appear to distort; a rectangle, for example, will appear as a trapezium. Gillam (1968) has shown 
that if the surface contains empirical depth cues such as parallel lines, the amount of perceived 
tilting will be significantly reduced. This reduction in the tilting of the figure is due to the conflict 
between the empirical cues on the one hand and the stereoscopic cues on the other, the former 
tending to dominate the situation and to inhibit the stereoscopic tilting of the surface. The 
perceived tilting of the surface may also be enhanced by drawing perspective lines on the 
surface to conform with the tilting that is indicated by the stereo-cues. 
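
As a rough guide to the size of the tilt described above, the sketch below uses the geometric approximation commonly quoted for a meridional size lens, tan(theta) = [(M - 1)/(M + 1)] x (2d/a), where M is the magnification, d the observation distance and a the interocular separation. The formula is supplied here as an assumption rather than taken from the present paper, and the 64 mm interocular separation and the function name `predicted_tilt_deg` are illustrative only; the 1.05 magnification and 1 m distance are those given in the Apparatus section.

```python
import math


def predicted_tilt_deg(magnification, distance_mm, interocular_mm):
    """Apparent rotation (degrees) of a frontal plane about a vertical axis.

    Uses the approximation tan(theta) = ((M - 1) / (M + 1)) * (2 * d / a),
    which is assumed here and not stated in the paper itself.
    """
    ratio = (magnification - 1) / (magnification + 1)
    return math.degrees(math.atan(ratio * 2 * distance_mm / interocular_mm))


# 5 per cent lens (M = 1.05) at 1 m; 64 mm interocular distance is assumed.
tilt = predicted_tilt_deg(1.05, 1000.0, 64.0)
print(f"predicted tilt: {tilt:.1f} degrees")
```

Under these assumptions the predicted tilt is about 37 degrees, somewhat larger than the mean settings reported in Table 1 (27.8 and 24.7 degrees).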

Now if the Müller-Lyer figures are viewed with an aniseikonic lens before one eye, the 
amount of stereoscopic depth that is perceived between the two figures might be expected to be 
influenced according to whether the perspective conforms or conflicts with the stereo-disparity 
which the lens introduces. According to Gregory the diverging fins of the apparently larger figure 
are implicit perspective cues which set the central shaft relatively more distant. Hence, if this 
figure is viewed on the same side as the lens is worn, the perspective cues would presumably be 
in the direction of the stereo-cues and a stable perception would be expected. On the other 
hand, when the other figure is placed on this side, the cues would presumably conflict and a 
reduced aniseikonic effect might be expected. Alternatively, as often happens in a conflicting 
situation, the perception may become unstable and the figures may tend to alternate in space, 
making a judgement difficult or impossible. This technique has the advantage that it is 
relatively free from suggestive influences, since questions concerning the relative distances of 
the two figures were not necessary. The subjects were completely ignorant of the purpose of the 
experiment. 

* Compare the phenomenon of reversed aniseikonic tilt (Gillam, 1967). 

Figure 2. The stereoscopic effect of an aniseikonic lens. 


Procedure 
Apparatus 


The set-up was similar to the tilting plane apparatus used by Ogle (1950). Müller-Lyer figures were drawn on 
a transparent plastic sheet which was slotted into a supporting frame. Each subject with his head steadied in 
a rest viewed the test figures which were seen against a uniformly illuminated background. Suitable screens 
restricted the subject's view of the test figures; the edges of the supporting frame were not visible to the 
subject. 

The illusion figures were drawn in Indian ink on a transparent sheet of plastic; the central shafts of the 
figures were 16 cm in length and 2 mm in thickness. For the figure with the ingoing fins, the fins were drawn 
at an angle of 45° to the central shaft. This angle was 135° for the figure with the outgoing fins. Each fin was 
6 cm in length and the figures were separated by 12 cm. The centres of the figures were adjusted to eye level 
and the observation distance was 1 m. 

A technique due to Seagrim (1967) was used to measure the perceived depth between the illusion figures. 
Situated directly below the figures and at a distance of 10 cm in front of the test plane was a horizontal 
Meccano strip 4 in. in length which served as a comparison object. The strip could be rotated around a 
vertical axis. This was achieved by means of a crank placed on the right side of the subject which was 
connected to the axle supporting the strip by means of worm gearing. Below the Meccano strip, and not 
visible to the subject, was a protractor which indicated the angle at which the strip was set from the frontal 
position. 

A 5.00 per cent aniseikonic lens (i.e. magnification 1.05 times) was attached to a holder fixed to the front 
of the head rest. For symmetry a piece of plain glass was placed before the other eye. The lower portion of 
the size lens was removed so that when the subject lowered his eyes he could view the Meccano strip free 
of the lens. The task of the subject was to rotate the strip by means of the crank until it appeared to be 
parallel to the vertical plane which contained the illusion figures when they were viewed through the lens. 
The angle at which the Meccano strip was set from the frontal position was a measure of the depth effect 
produced by the aniseikonic lens. For technical reasons the size lens was always placed before the subject's 
right eye. 
Table 1. Subjective tilt of figures (degrees) 

Subject      1*      2‡       3†      4‡      5 (Difference) 
------------------------------------------------------------- 
P.Y        31.5    1.39     29.8    1.35      1.7 
T.T        20.5    1.27     13.7    1.36      6.8 
B.R        32.8    1.81     38.3    2.81     -5.5 
C.R        25.4    2.71     11.9    1.73     13.5 
J.D        32.9    2.54     30.8    1.26      2.4 
D.K        23.5    2.17     24.4    2.99     -0.9 
C.P        12.4    1.78     10.9    1.88      1.5 
E.L        30.7    0.96     31.4    1.59     -0.7 
J.P        27.9    2.41     22.4    2.74      5.5 
M.P        20.1    0.90     17.9    2.61      2.2 
J.G        27.6    1.93     16.6    2.10     11.0 
K.A        31.0    1.42     34.4    1.44     -3.4 
F.D        30.6    3.34     25.0    3.77      5.6 
K.B        24.7    1.14     16.7    1.48      8.0 
H.S        30.9    2.32     21.6    3.37      9.3 
R.E        30.3    1.42     25.8    3.09      4.5 
R.R        36.7    2.59     41.3    2.01     -4.6 
A.M        29.7    0.46     28.9    1.62      0.8 
S.R        30.2    0.37     28.8    1.35      1.4 
Means      27.8    1.717    24.7    2.134     3.1 
------------------------------------------------------------- 
5.00 per cent size lens, axis vertical, before the right eye (R.E.). 

* Column 1 = Apparently smaller arrow on subject's right side. 
† Column 3 = Apparently larger arrow on subject's right side. 
‡ Columns 2 and 4 give the standard deviation for each case. 


Method 


Nineteen subjects (all students of optometry) took part in the experiment, their average age being 20 years. 
A few of the students were familiar with the effects of aniseikonic lenses but they were totally ignorant of 
the purpose of the experiment. Before embarking on the actual experiment the subjects practised with the 
apparatus by observing two continuous vertical lines; the ends were not visible. The task was to set the 
Meccano strip until it appeared to be parallel to the plane containing the two test lines. 

When they were familiar with the task and gave reasonably consistent readings, the plastic sheet depicting 
the vertical lines was removed from the frame and replaced by the sheet with the illusion figure. 

Each subject made ten settings of the comparison strip. After making a setting the subject notified the 
experimenter who noted the angle at which the Meccano strip was set. The subject then closed his eyes 
for around 30 sec while the experimenter moved the strip into some other position. On opening his eyes the 
subject repeated the procedure. After ten adjustments of the strip had been made a shield was placed in 
front of the subject and the plastic sheet was reversed in the frame which reversed the positions of the 
Müller-Lyer figures. There was no fixed order in which the test figures were presented. For some subjects, 
the figure with the ingoing fins was placed on the subject's right for the first series of measurements, but for 
other subjects the order was reversed. 


Results 

Table 1 gives the mean results of the 19 subjects. The first column contains the results when the 
apparently smaller figure of the illusion was on the subject’s right, that is, on the same side as 
the aniseikonic lens and therefore was perceived further away. The third column contains the 
results when the other figure was on the subject’s right side. The second and fourth columns 
give the standard deviations of the two sets of observations, and the fifth presents the individual 
differences between the two series of settings, i.e. the differences between the figures in the first 
and third columns respectively. It is noted that, with the exception of five subjects (negative 
differences), a greater angle of perceived tilting is recorded when the figure with the ‘ingoing’ 
fins was situated on the same side as the lens. In other words, a greater depth effect occurred 
when the apparently smaller illusion figure was seen at a greater distance. This would suggest 
that in this presentation there is less conflict between the monocular visual cues and the 
stereoscopic depth cues. However, conflict was apparent when the larger figure was on the side 
of the lens. This is shown in the relative reduction of the perceived depth and the greater 
uncertainty in making settings which is evident in the larger standard deviation. The differences between 
the two sets of measurements are significant (P < 0.05, t test). 
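
The significance statement can be checked approximately from the tabled means. The sketch below assumes that the test was a paired (related-samples) t test on each subject's two mean settings, which the text does not state explicitly; the differences are recomputed from columns 1 and 3 of Table 1 rather than copied from column 5, and the variable names are illustrative.

```python
import math
from statistics import mean, stdev

# Mean tilt settings (degrees) transcribed from Table 1, in subject order.
smaller_on_right = [31.5, 20.5, 32.8, 25.4, 32.9, 23.5, 12.4, 30.7, 27.9, 20.1,
                    27.6, 31.0, 30.6, 24.7, 30.9, 30.3, 36.7, 29.7, 30.2]
larger_on_right = [29.8, 13.7, 38.3, 11.9, 30.8, 24.4, 10.9, 31.4, 22.4, 17.9,
                   16.6, 34.4, 25.0, 16.7, 21.6, 25.8, 41.3, 28.9, 28.8]

# Per-subject difference: column 1 minus column 3.
diffs = [a - b for a, b in zip(smaller_on_right, larger_on_right)]
n = len(diffs)

# Paired t statistic: mean difference divided by its standard error.
t = mean(diffs) / (stdev(diffs) / math.sqrt(n))

# Tabled two-tailed 5 per cent critical value of t for 18 degrees of freedom.
T_CRIT_18_DF = 2.101

print("d.f. =", n - 1)
print("mean difference =", round(mean(diffs), 2))
print("t =", round(t, 2))
print("exceeds 5 per cent critical value:", abs(t) > T_CRIT_18_DF)
```

On this assumption the recomputed statistic exceeds the 5 per cent critical value, in agreement with the result reported above.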


Conclusions 


The statistical evidence of the experiment does not support Gregory's theory, although five of the 19 
subjects did give readings consistent with it. However, the 1968 observation of Pike & Stacey that the 
illusion of size may be effective as a depth cue is confirmed. 

The reason for the minority response is not known, but it is conceivable that the subjects 
perceived the figures in depth as predicted by Gregory. In view of the evidence of contrary 
depth effects, the ambiguity of the figures (Hotopf, 1966; Holding, 1970) and the fact that the 
illusion of size is a stronger depth cue than the perspective in the fins, it would seem unlikely 
that primary scaling is the cause of the illusion. In this respect one is reminded of Humphrey & 
Morgan's (1965) claim that a theory which appeals to a hidden constancy mechanism manifested 
only in the illusions themselves is essentially untestable. 


The meaning and measurement of memory imagery 
Alan Richardson 


Some contemporary trends in social and clinical psychology suggest that individual differences in imaging 
abilities may become increasingly important. This paper outlines some of the conceptual and methodological 
problems that must be clarified if productive research is to be undertaken and reports the results of three 
studies designed to isolate tests which might provide independent measures of imagery vividness and 
imagery control. 


The theoretical and practical importance of studying individual differences in imaging abilities is 
suggested by the work of several contemporary social and clinical psychologists. Allport (1968) 
defines social psychology to include imagined social situations as well as those which are implied 
or actually present and Elms (1966) and Cautela & Wisocki (1969) are among those who have 
investigated imagery vividness as a significant variable in the mediation of attitude change. 
Lazarus (1973) has given imaging a central place in the therapeutic process and other clinical 
psychologists like Mintz & Alpert (1972) and psychiatrists like Horowitz (1967, 1970) have 
demonstrated its relevance to understanding pathological forms of experience and behaviour. 

Mental imagery in all its forms (see Richardson, 1969) has become a problem area of 
widespread interest but like many other problem areas in psychology it is not yet free from a 
certain amount of conceptual and methodological confusion. This paper will clarify the main 
meanings that have been given to one class of mental imagery (memory imagery). It will be 
proposed that the traditional meaning of a memory image as a quasi-perceptual event which has 
adaptively useful functions should be preserved and that other meanings of memory imagery 
should have distinctively different names. Attention will then be focused upon the methodological 
problems of measuring the vividness and controllability of the memory image and some data 
relevant to these measurement problems will be presented and discussed. 


The conceptual problem 


Three contemporary usages will be considered. The first, and most extreme usage, denies that 
anything in the nature of a quasi-perceptual event is experienced and redefines the concept of 
image as a form of role taking. The second uses the concept of imagery to refer to a mode of 
memory representation which is distinct from the verbal mode, but which sometimes has overtones 
suggesting the traditional view of an image as something which has some kind of 
quasi-perceptual content. The third seeks to distinguish more clearly the unconscious memory 
representation from its conscious counterpart which is the image proper. 


Imagery redefined as role playing 

Theodore Sarbin (1970, 1972) believes that ‘Like so many other psychological concepts 
imagining as experiencing of pictures in the mind is an instance of that natural process of 
language development in which the figurative becomes assimilated to the literal...’ (Sarbin & 
Juhasz, 1970). It is argued that the activity of imaging is no more than a form of ‘imitative’ or 
‘as if’ behaviour. Thus the patient in a mental hospital who says that she sees the mother of 
God is attempting to solve some personal problem ‘by forming the hypothesis that the mother of 
God is present’. The patient’s behaviour may be described, most accurately, he believes, as like 
that of someone who is seeing the mother of God. She is playing the role or imitating the 
behaviour of someone who is seeing the mother of God but clearly she is not really seeing 
anything. To ‘really see something’ requires the presence of an independently verifiable stimulus 
object, therefore to talk in terms of seeing an imaged person or object is of necessity to be 
talking in figurative and not in literal terms. 

Of course, if ‘really seeing something’ is defined as involving the presence of an independently 
verifiable stimulus object then any other kind of talk about ‘seeing something’ in the absence of 
such a stimulus must necessarily be figurative. However, this argument is unsound logically and 
unsound empirically. It is unsound logically because it arbitrarily defines the experience of 
seeing something as requiring or including the presence of an independently verifiable stimulus 
object. This is equivalent to assuming what has to be proved and is obviously an invalid form of 
argument. It can be shown to be unsound empirically by citing the ubiquitous phenomena of 
after-imagery and dream imagery. In a visual after-image, for example, there is an experience of 
‘really seeing something’ in the absence of any independently verifiable stimulus object. The 
fact that a stimulus object had been present a second before is not relevant. The fact is that the 
stimulus object is not present at the time the experience is taking place. The experience is ‘real’ 
to the person who has it and he would deny that his report of its qualities was in any way 
metaphorical. 

Though it is highly probable that role-taking ability requires imaging ability it is unwarranted to 
assume that the two abilities or the processes involved in their operation are identical. 


Imagery defined as a kind of representation 


Some researchers who employ the term imagery in their descriptions or explanations allow its 
precise conceptual meaning to remain ambiguous. For example, when Allan Paivio writes, ‘... 
the term imagery will be used to refer to a memory code or associative mediator that provides 
spatially parallel information that can mediate overt responses without necessarily being 
consciously experienced as a visual image’ (1971, pp. 135-136) he is clearly denying the 
relevance of any phenomenally present quasi-perceptual event; yet his chief method of 
measuring imagery vividness involves the use of concrete and abstract nouns which have been 
found to elicit varying intensities of ‘consciously experienced’ imagery. So, on the one hand it is 
argued that retention is better for concrete than for abstract nouns in a paired associate 
learning task because the former elicit more imagery than the latter, while on the other hand it is 
stated that this result does not require that a memory image be consciously experienced. 

Though this kind of conceptual ambiguity does not prevent valuable empirical research from 
being undertaken it may, as Pylyshyn (1973) has suggested, contain ‘misleading implications 
which carry over undetected into psychological theories’ (p. 2b). 


Imagery as conscious representation 


The distinction between the unconscious representation of perceptually encoded information (as 
opposed to verbally encoded information) and the conscious image of the same information is 
most clearly stated by Robert Holt (1972). He writes, ‘. . .conscious processes correspond to 
molar brain processes because they are different ways of looking at a complex unity; both types 
of process are real and equally dignified as a subject matter of science. In addition, however, 
this total event is one of information processing; the information which exists as consciously 
apprehended meaning under some circumstances is encoded in the biochemical and bioelectric 
events of the brain process, and as such is not dependent on consciousness for its continued 
existence...’ (p. 8). He goes on, ‘I propose, therefore, that we. . .tentatively accept the 
proposition that consciousness may make a considerable difference, and therefore use a different 
term (a) when we are speaking about a phenomenal content of a sensory or quasi-sensory 
nature, and (b) when we are speaking about the same meaning as mediated by (encoded in) a 
brain process without awareness. Let us reserve the term image for (a) the former of these two 
cases, and presentation for the second (b)’ (p. 10). 
It is with this definition of memory image as a consciously experienced representation (a 
quasi-perceptual event) that this paper is concerned. It would be a useful convention if all 
research workers maintained the conceptual distinctions suggested in this section when reporting 
the results of their studies. 


The methodological problem 


If the term ‘image’ is to be restricted to those quasi-perceptual events which are in some sense 
‘seen’, ‘heard’, etc., by a subject the basic data must depend upon self-report. Psychologists 
have been justifiably nervous when asked to accept data of this kind though its logical 
admissibility, theoretical desirability and empirical utility have been amply demonstrated (see for 
example: Richardson, 1965; Singer, 1966; Hilgard, 1969; Hannay, 1971; Thayer, 1971). 

To be assured that self-report data on the vividness and controllability of the memory image 
are related unambiguously to their corresponding conceptual properties requires particularly 
close attention to the problems of reliability and validity. Let us take the Betts QMI Vividness 
of Imagery Scale as an example. This 35-item test was constructed by Sheehan (1967a) and has 
been found to have high internal consistency within each modality as well as high correlations 
between modalities. (But see White, Ashton & Law, 1974, for their data and discussion on the 
importance of measuring modality-specific imagery.) Its test-retest reliability over a seven-month 
period is 0.78 (Sheehan, 1967b) and it correlated positively with ratings on a separate measure of 
visual imagery (0.50, P < 0.01) and negatively with the time taken for an image to reach its 
greatest clarity (-0.24, P < 0.05) in a study reported by Rehm (1973). 

Some support for its predictive validity has been obtained in studies by Sutcliffe, Perry & 
Sheehan (1970) where it was found to have a significant but non-linear association with hypnotic 
susceptibility and in studies by Sheehan (1972) on accuracy of recall after incidental learning. 

Facts of this kind are reassuring but the sceptical critic rightly asks whether the reliability data 
might not result from the operation of one or more response sets and the validity data from the 
presence of some correlated cognitive or temperament factors (see Richardson, 1969, pp. 92, 
134). These questions plague many types of study in which individual difference variables are 
involved, e.g. repression/sensitization (Byrne, 1964) but the obligation to obtain evidence to 
justify one interpretation rather than another cannot be ignored. 

Even if it could be established that scores obtained on the QMI are an accurate reflection of 
the imagery experienced by a subject and that performance of other tasks is a necessary 
consequence of employing this ability to image, it would still be an advantage if equivalent 
methods of measuring imagery, objectively, could be found. 


Measuring the vividness and controllability of memory imagery 


The three studies reported below were designed to obtain information which bears upon some of 
the critical measurement problems associated with the use of self-report tests for assessing 
individual differences in the ability to form and control memory images. The ability to form 
images refers to the vividness or clarity with which they can be realized and the ability to 
control them refers to the ease with which an image can be altered or replaced by another. 


Aims 
Imagery vividness 

1. Are self-report measures of imagery vividness independent of other categories of cognitive 
and temperament tests? 

2. Does self-reported imagery vividness remain stable over time even when measured by 
somewhat different tests? 

3. To what extent do response sets influence the degree of imagery vividness reported in 
different situations? 
4. Can any evidence be found for the existence of objective (performance) tests of imagery 
vividness? 


Imagery control 
1. Are self-report measures of imagery control independent of other categories of cognitive 
and temperament tests? 

2. Does self-reported imagery control remain stable over time when measured by 
somewhat different tests? 

3. To what extent do response sets influence the degree of imagery control reported in 
different situations? 

4. Can any evidence be found for the existence of objective (performance) tests of imagery 
control? 


Method 


Though data relating to the aims outlined above were obtained from three separate studies only the first of 
these will be described in detail. The second study involved the administration of 20 tests to 23 male and 29 
female university students as part of their third-year personality course. The third study involved the 
administration of 16 tests to 100 boys (aged 16-17 years) who were undergoing training at HMAS Leeuwin, a 
naval shore station in Western Australia (Parker, 1973). 


First Study 
Details of the subjects, the variables and the administrative procedures are given below. 


Subjects. A total of 26 male (mean age: 19 years) and 34 female (mean age: 20 years) university students 
completed all the tests to be described. 


Variables. All 42 variables measured in this study are listed below in order of administration. 

1. Age 

2. QMI Vividness of Imagery Scale (see Richardson, 1969): Instead of the usual method of 
self-administration the two practice and 35 test items were read out to the subjects. Each image obtained was 
rated on a seven-point scale (six for maximum vividness, zero for a total absence of imagery). The 
vividness score is the total sum of ratings for all items in all modalities (range: 0-210). 

3. Concealed Figures: This is Form B of the Gottschaldt Figures test developed by Thurstone (1951). It 
comprises two practice items and 49 test items (range: 0-49). Time allowed: 10 mins. 

4. Position Memory: This test is based on procedures employed by Whiteley (1962) and Rimm & Bottrell
(1969). Subjects received the following instructions: ‘In a few moments a coloured slide will be projected 
onto the screen. It will contain 12 different objects and it will be exposed for 30 sec. Your task will be to 
look at it carefully so that you will be able to obtain a clear image of it when asked to recall the position of 
each object. Immediately after the first slide a second one will be projected onto the screen for the same 
period as before. Once again you are asked to look at it carefully so that you will be able to obtain a clear 
image of it when asked to recall the position of each object. After the 30 sec exposure of this second slide 
has elapsed your ability to recall the position of each object will be tested. The 12 objects shown in the 
second slide are all different from those shown in the first slide. Any questions?. . . Right. Here is the first 
slide.’ 

Ten seconds after the removal of the second slide subjects received the following additional instructions: 

‘You will now be shown separate slides of each object that appeared in the first slide. Each of the 12 objects
is numbered and your task is to image the position in which it was placed. Then write down the object’s 
number in the quadrant closest to that position.’ (The answer page consisted of a plain sheet of quarto paper 
bisected vertically and horizontally by ruled lines.) ‘Each object will be exposed for a period of five seconds 
and followed by a 10 sec period in which you should try to ‘‘see’’ the object in its original position as it 
appeared in the first slide.’ This procedure was repeated on a separate answer page for the objects on the 
second slide. The score on this test is the sum of the objects correctly placed (range: 0-24). 

5. Controllability of Imagery: This is a new task constructed by the writer. It was introduced with the 
following instructions: ‘The task that you are now asked to attempt consists in making a judgement between 
the ease with which you can conjure up and clearly retain the second image in each pair that will be 
presented to you. Let me begin by giving you an example, by taking a pair of visual images. Your task will
be to note down whether the second image in the pair is relatively easier to image, relatively harder to image 
or no different in its difficulty level from the first image in the pair. Ready?’ The first practice item illustrates 
the nature of the task for the visual modality. 

‘Obtain as clear an image as possible of a car standing in the road in front of a house.’ (Allow 5 sec.) 
‘Now obtain as clear an image as possible of the same car in the same place but lying upside down with its 
wheels in the air.’ (Allow 7 sec.) ‘Was the second image in any way easier or harder to form and hold? Even
if it was only a fraction harder or a fraction easier you should report the nature of the difference by 
underlining the appropriate answer.’ A second practice item and ten test items then followed. Each test item 
was succeeded by the question ‘Was the second image in any way harder or easier to form and hold than the 
first, or was there no difference?’ Where a response indicated that the second image of a pair was harder to 
form and hold than the first it was scored zero (0). All other responses to an item were scored one (1). A
pilot study had revealed that only five of the items were positively correlated with each other: one each
from the visual, olfactory and organic modalities and two from the kinaesthetic. The total score on this test
was based on these five items (range: 0-5).

6. Number series: This is a 3 min 20-item unidimensional test of intelligence developed by Lumsden 
(1959). The total score is the number of correct responses (range: 0-20).

7. Mill Hill Vocabulary Scale (Set A): This test contains 33 words each of which is presented in 
association with six other words. The subject's task is to select the correct synonyms. The total score is the 
number of correct responses (range: 0-33). 

8. Necker Cube Fluctuations: A cube of 25 mm side was presented on a page in the test booklet. Subjects 
were instructed to fixate the cube for 60 sec and while doing so to make a mark with a pencil each time a 
perspective reversal occurred. This procedure was repeated under two conditions, first with the subject 
‘willing’ as fast a rate of reversal as possible, and second, ‘willing’ as slow a rate as possible (see Gordon, 
1950). The score calculated for use in this study is the difference in number of reversals occurring under the 
fast and slow conditions. A low score is interpreted as indicating ‘rigidity’ or lack of perceptual control. 

9. Memory for Designs: The eight most difficult designs from the test devised by Graham & Kendall 
(1946, 1960) were reproduced on slides. Subjects were told that, ‘In this test, a slide will be exposed on 
which there will be a simple design. You are to study this design so that when it is removed, you can draw a 
copy of it.’ After presenting a practice slide each of the eight test slides was shown for a period of 5 sec. 
Accuracy of each reproduction is scored on a two point scale (0 or 1). On the basis of past experience with 
this task final scores were allotted as follows: where five or fewer of the designs were drawn accurately a 
score of one was given (1); six accurate designs (2); seven (3); and eight (4) (range: 1-4). 

10. Cutting a Cube: This traditional ‘imagery’ task was introduced by asking subjects to: ‘Imagine a cube
3" x 3" x 3" and painted red all over. Now imagine that it has been cut into 27 smaller cubes each 1" x 1" x 1" by
making two equidistant vertical cuts and two equidistant horizontal cuts. Your task is to answer the
following questions: “How many cubes have three faces painted red? How many cubes have two faces
painted red? How many cubes have one face painted red? How many cubes have no faces painted red?”’
These questions were printed in the test booklet and instructions were given for hands to be placed on the
desk until subjects were ready to record their answers (range: 0-3).

11. Rated Vividness of Imagery: Following the ‘cutting a cube’ task each subject was asked to assess the 
clarity of any images associated with carrying it out, by means of the seven-point rating scale used by
Barratt (1953) (range: 0-6). 

12. Rated Controllability of Imagery: Each subject was asked to assess the ease with which the imagery 
aroused by the ‘cutting a cube’ could be manipulated. A six-point rating scale used by Barratt (1953) was 
employed (range: 0-5). 

13. Adjective Check List (1): A total of 46 adjectives were presented with the instruction to place a tick 
against all of those that described an aspect of the subject's personality. Six of these adjectives (dependable, 
efficient, mannerly, persistent, sociable, sympathetic) had been found by Segal & Nathan (1964) to be more 
frequently used in the self-descriptions of vivid imagers. The score was the total number of these six 
adjectives that were ticked by a subject (range: 0-6).

14. Adjective Check List (2): At the conclusion of the last task subjects were instructed to underline the 
five adjectives, out of all those that had been ticked, which best described them. Scores were allocated for 
each critical adjective ticked (range: 0-5).

15. Figure Preference Test: This is a non-verbal test constructed by Breskin (1968) for the measurement of
‘rigidity’. From each of 15 pairs of figures subjects are required to choose the one preferred. Each pair
contains a ‘simple’ and a ‘complex’ version of the same basic figure. Preference for simpler figures is said
to imply ‘rigidity’ and the higher the score the more ‘rigid’ the person (range: 0-15).

16. Personal Reaction Inventory - (Neuroticism): Six of the items among the 45 in this ‘True/False’
inventory measured ‘neuroticism’ (Eysenck, 1958). The higher the score the more ‘neurotic’ the person 
(range: 0-6). 

17. Personal Reaction Inventory - (Introversion/Extraversion): A further six items were interspersed 
among the items in this inventory for the measurement of introversion/extraversion (Eysenck, 1958). The 
higher the score the more extraverted the person (range: 0-6). 

18. Personal Reaction Inventory - (Social Desirability): The 33 items from the Crowne & Marlow (1964) 
social desirability scale were included in this inventory. The higher the score the greater the need for 
approval or more specifically, the greater the tendency to respond to items in a socially desirable manner 
(range: 0-33). 

19. Ways of Thinking (Imagery): This measure comprises 39 items in an 86-item ‘true/false’ questionnaire 
developed by Allan Paivio and his colleagues (Paivio, 1971, p. 495) for the measurement of the extent to 
which ‘subjects habitually used imaginal and verbal modes of thinking’ (range: 0-39).

20. Ways of Thinking (Verbal): The remaining 47 items from the questionnaire described above are keyed 
for the measurement of habitual verbal modes of thinking (range: 0-47). 

21. Incidental Memory (Page): This test was the last to be taken at the second testing session. The 
instructions were as follows, ‘Think back to the ‘‘Figure Preference Test” earlier in this booklet (DO NOT 
LOOK BACK). Try to recall the pairs of figures that appeared on the first page of the test. There were eight 
pairs altogether. Which were they and where were they located? All 15 pairs are shown below. Look through 
them and select the eight that were on the first page and indicate their position on that page by placing the 
appropriate identifying letters against their position numbers.’ The score obtained is the number of pairs 
correctly allocated to the first page (range: 0-8). 

22. Incidental Memory (Position): A second more precise measure was derived from the responses given 
to the test described above. Only pairs of items that were allocated to their correct positions on the first page 
were given credit (range: 0-8). 

23. Picture Memory: This test was used by Lumsden (1964). It requires a subject to look at a picture for 
3 min. At the end of this time it is removed and a second picture presented which is in some ways similar 
and in some ways different to the first. Five minutes are allowed in which to use the second picture as a 
basis for judging the truth or falsity of 30 statements comparing features of the first and second pictures. 
The score is the number of statements judged correctly (range: 0-30). 

24. Hidden Figures (Cf-1): This test is another version of the concealed figures test (Variable 3) and is 
reported by French, Ekstrom & Price (1963). It contains 16 items and is administered under time stress 
conditions (3 min). The score is the number of items answered correctly minus one-fifth of those answered 
incorrectly (range: 0-16). 

25. Cube Comparison (S-2): This is a 3 min 21-item spatial manipulation test adapted from Thurstone's 
cubes test (French et al. 1963). The score is the number answered correctly minus the number answered 
incorrectly (range: 0-21). 

26. Paper Folding (Vz-2): This is a 3 min, ten-item visualization test developed from Thurstone's punched 
holes test (French et al. 1963). The score is the number right minus one-fifth of the number wrong (range: 
0-10). 

27. Visual Imagery (Difference between vividness rated under eyes closed and eyes open condition): 
Subjects were asked to close their eyes and image a house with which they were familiar. Vividness 
ratings were made on the same seven-point scale used in the QMI (Variable 2). The same scene was then 
imaged with eyes open and another rating made. The score is the difference between the two ratings 
irrespective of direction (range: 0-6). 

28. Vividness of Visual Imagery (1): Five different scenes are imaged in turn. Each is rated for vividness 
on the same seven-point scale as before. The score is the sum of the ratings (range: 0-30). 

29. Vividness of Visual Imagery (2): One of the five scenes from the last test was selected for special 
attention. Subjects were instructed to, ‘Think back on your attempt to imagine the dazzling brightness of the 
sun at mid-day. (Pause). Did you find any tendency to screw up your eyes as you might do if you were 
actually looking at the sun? Underline whichever one of the following descriptions is most accurate: (a) 
Screwed up my eyes; (b) Slight tendency to screw up my eyes; (c) No tendency to screw up my eyes’ 
(range: 0-2).

30. Vividness of Auditory Imagery: Five sounds are to be imaged and a rating of each is made on the 
usual seven-point scale of vividness. The score is the sum of the ratings (range: 0-30). 

31. Vividness of Kinaesthetic Imagery: Five situations in which muscular sensations would occur are to be 
imaged and a vividness rating made as before. The score is the sum of the ratings (range: 0-30). 

32. Habitual Mode of Thinking (Visual or Verbal): Subjects were given the following instructions: ‘By
opening your mouth or putting your tongue between your teeth, try to eliminate any small movements of
your mouth or lips. Now mentally think of each of these words in turn: Bubble, Toddle, Putty.

A. If you hear these words clearly in your “mind's ear”, with no blurring or loss of the consonants, then
put a circle round A.

B. If you tend only to “hear” the vowels, with the consonants blurred or cut out, “u'le” for Bubble,
“o'le” for Toddle and “u'y” for Putty, then put a circle round B.’ Answers of ‘A’ are scored (2) and ‘B’
scored (1).

33. Vividness of Imagery: Based upon the sum of scores on variables 28, 30 and 31 a total vividness of 
imagery score is calculated (range: 0-90). 

34. Controllability of Imagery (1): This is the slightly modified form (Richardson, 1969) of the test first 
constructed by Gordon (1949). The score is the number of scenes out of 12 which a subject can visualize 
clearly. Each ‘Yes’ response is given a credit (range: 0-12). 

35. Controllability of Imagery (2): The same test as above but scored: Yes (2); Unsure (1); No (0) (range: 
0-24). 

36. Revised Minnesota Paper Form Board (MPFB): Form DB of this test was used with a 20 min time
limit. The score is the number of items answered correctly minus one-fifth of those answered incorrectly 
(range: 0-64). 

37. ACER Speed and Accuracy: This test consists of two parts each with a time limit of 6 min. Part one 
comprises a set of 160 number checking items and part two, 160 name checking items. The score is the total 
number of correct responses (range: 0-320). 

38. Personal Experience Questionnaire (PEQ): This is the 28-item version of a test reported by Shor (1960) 
for the measurement of hypnotic susceptibility. The questions are all concerned with normally occurring (i.e. 
not under special conditions such as drug intoxication) altered states of consciousness (see Tart, 1969). 
Answers are of the yes/no type and after responding to the 28 items subjects are asked to review the
experiences which they claim to have had and to place one tick against any which had been fairly vivid and 
two ticks against any which had been especially vivid. The score is the number of items answered ‘yes’ plus 
the total number of ticks (range: 0-84). 

39. Imagery Response Times (Concrete Nouns): Two word lists were compiled. Each contained ten 
concrete nouns and ten abstract nouns matched for meaningfulness and frequency of occurrence (Paivio, 
Yuille & Madigan, 1968). Within each list concrete and abstract nouns were intermingled randomly. For 
administrative convenience subjects were paired and each acted as the experimenter for the other. The task to
be described was readily understood by subjects, as by this time all of them were familiar with the general
notion of mental imagery and with the details of their own imagery in particular. The instructions were for 
experimenters to record, with a stop watch, the time between stimulus presentation (when the word had 
been spoken) and response (when subject raised a finger to indicate that an image had been obtained). The 
instructions made clear the distinction between the nature of a verbal response and the nature of an image 
response. Subjects were required to indicate that they understood this distinction and that they would not 
respond until a true image, in any modality, had occurred. Practice trials were given before any words from 
the test proper. The score is the mean response time to the ten concrete nouns. 

40. Imagery Response Time (Abstract Nouns): The score is the mean response time to the ten abstract 
nouns used in the last test. 

41. Difference Between Response Times (Abstract nouns minus concrete nouns): The score is the mean 
response time to the ten abstract nouns minus the mean response time to the ten concrete nouns. 

42. Gaze Break: As each subject completed the last task he was taken aside and asked to answer three 
questions designed to produce reflective thought (e.g. ‘How many letters are in the word anthropology?’). 
Where a subject broke gaze to reflect upon the question the direction was noted. When the eyes are moved 
to the subject's left in response to each of the three questions a score of two is allocated. Evidence from 
Bakan (1969) suggests that habitual left movers are more likely to be habitual imagers. Habitual right movers 
may be more inclined to be verbalizers and are scored zero. A score of one is given where a subject 
responds inconsistently (range: 0-2).
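
Several of the scoring rules in the list above are simple arithmetic and can be restated compactly. The Python sketch below is only an illustration of those rules as they are worded in this section; the function names and input formats are hypothetical, and two readings are assumptions: that PEQ ticks are counted only for items answered ‘yes’, and that a ‘habitual right mover’ on Gaze Break is a subject who looks right on all three questions.

```python
def corrected_score(right, wrong):
    """Number right minus one-fifth of the number wrong (variables 24, 26 and 36)."""
    return right - wrong / 5


def memory_for_designs_score(accurate):
    """Map the count of accurately drawn designs (0-8) onto the 1-4 scale of variable 9."""
    if not 0 <= accurate <= 8:
        raise ValueError("accurate must lie between 0 and 8")
    # Five or fewer accurate designs score 1; six, seven and eight score 2, 3 and 4.
    return max(1, accurate - 4)


def peq_score(responses):
    """Variable 38: number of 'yes' answers plus the total number of ticks.

    `responses` is a list of (answered_yes, ticks) pairs, with ticks in {0, 1, 2}.
    Assumption: ticks are only counted for items answered 'yes'.
    """
    total = 0
    for answered_yes, ticks in responses:
        if ticks not in (0, 1, 2):
            raise ValueError("ticks must be 0, 1 or 2")
        if answered_yes:
            total += 1 + ticks
    return total


def gaze_break_score(directions):
    """Variable 42: 2 for consistent left movers, 0 for consistent right movers, else 1.

    Assumption: 'habitual' is read as the same direction on all three questions.
    """
    if all(d == "left" for d in directions):
        return 2
    if all(d == "right" for d in directions):
        return 0
    return 1


def response_time_difference(abstract_times, concrete_times):
    """Variable 41: mean abstract-noun latency minus mean concrete-noun latency."""
    mean_abstract = sum(abstract_times) / len(abstract_times)
    mean_concrete = sum(concrete_times) / len(concrete_times)
    return mean_abstract - mean_concrete
```

For example, a subject with 12 items right and 5 wrong on Hidden Figures would receive `corrected_score(12, 5)`, i.e. 12 minus 1, and a subject who answered ‘yes’ with two ticks to all 28 PEQ items would reach the stated maximum of 84.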


Procedure. Scores on each of these 42 variables were obtained from tests administered in September 1971,
May 1972 and September 1972. At the first of these testing sessions 427 first year psychology students at the
University of Western Australia provided data on variables 1 to 5. Tests were administered by the writer and 
other staff members to groups containing between 15 and 25 students. Discussion of the tests and their 
purpose followed this session and most students heard a lecture on mental imagery and received reference 
reading on the topic as part of their normal course work. 

About seven months later all those who had expressed themselves as willing to take part in a second
testing session were sent letters inviting them to attend at one or other of five suggested times. Testing took
place in groups of 20 or less and complete data, on variables 6 to 22, were obtained for a total of 81
subjects.

At the third session the testing took place in groups of 15 or less. Complete data, on variables 23 to 42, 
were obtained for a total of 60 subjects. The administration of all tests at the second and third sessions was 
undertaken by the writer. A token amount of $A1.00 was paid to each subject for attendance at each of the
two later sessions.


Results and discussion 


The evidence relating to each of the questions asked earlier under the heading of aims is 
discussed below. 


Imagery vividness 

1. Separate principal component analyses were undertaken on a 42 × 42 matrix of correlations
for males, females and total persons. Thirteen factors with eigenvalues equal to or greater than
1.0 were rotated to the Varimax criterion in the male sample and accounted for 88 per cent of
the common variance. Using the same criterion 14 factors were found in the female sample, 
accounting for 90 per cent of the common variance. Nine rotated factors were readily matched 
and were confirmed by a similar analysis on the total sample of persons (see Table 1). 
Interpretation of four of these factors was aided by the marker variables: extraversion (17), 
speed (37), verbal ability (7), and spatial vividness (36). Two other factors were interpreted in 
terms of the processes and the content related to the task requirements: latency of imagery (40) 
and position memory (21). The factor called spatial control might equally well be called 
neuroticism (16) if this test is accepted as a marker for the well-established factor of this name. 
However, its interpretation as a measure of spatial control was suggested by the relatively high 
loading of Necker Cube Fluctuations (8) which has been found in association with neuroticism in 
previous studies of controllability (see Richardson, 1972). Further discussion of the imagery 
control and spatial control factors will be delayed until the questions relating to this general 
dimension of imagery control are considered. 

The remaining factor of imagery vividness is clearly distinct from all other factors. This is 
particularly obvious when the Betts QMI (2), the imagery measure from the Ways of Thinking 
questionnaire (19) and overall measure of vividness of imagery (33) are taken as the defining 
tests. That spatial ability tests load a separate orthogonal factor from self-report measures of 
imagery vividness was first established by Downie (1967). Replications of this finding, with an 
extension to show that verbal ability is independent of spatial and of imagery abilities, have been 
made by Di Vesta, Ingersoll & Sunshine (1971) and Paivio (1971) as well as by Richardson in the 
first and second studies reported here. 

Self-report measures of imagery vividness show a high degree of independence from other 
categories of cognitive and temperament tests.
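
The analysis described under point 1 (principal components of a correlation matrix, retention of components with eigenvalues of 1.0 or more, rotation to the Varimax criterion) can be sketched in Python with NumPy. This is not the author's original computation: the function names, the synthetic data and the convergence settings are all assumptions made for illustration, and the sketch reports loadings only, not the ‘per cent of common variance’ figures quoted in the text.

```python
import numpy as np


def principal_components(corr, min_eigenvalue=1.0):
    """Return all eigenvalues (descending) and the loadings of the retained components."""
    eigvals, eigvecs = np.linalg.eigh(corr)      # eigh: for symmetric matrices
    order = np.argsort(eigvals)[::-1]            # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals >= min_eigenvalue             # eigenvalue-of-one retention rule
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    return eigvals, loadings


def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal Varimax rotation of a (variables x factors) loading matrix."""
    n_vars, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        column_ss = np.diag((rotated ** 2).sum(axis=0))
        u, s, vt = np.linalg.svd(
            loadings.T @ (rotated ** 3 - rotated @ column_ss / n_vars)
        )
        rotation = u @ vt                        # product of orthogonal matrices
        new_criterion = s.sum()
        if criterion != 0.0 and new_criterion / criterion < 1 + tol:
            break
        criterion = new_criterion
    return loadings @ rotation


# Synthetic illustration: 60 'subjects', 6 'tests' driven by two latent variables.
rng = np.random.default_rng(0)
latent = rng.normal(size=(60, 2))
scores = np.hstack([
    latent[:, [0]] + 0.5 * rng.normal(size=(60, 3)),
    latent[:, [1]] + 0.5 * rng.normal(size=(60, 3)),
])
corr = np.corrcoef(scores, rowvar=False)
eigvals, loadings = principal_components(corr)
rotated = varimax(loadings)
print("retained components:", loadings.shape[1])
print(np.round(rotated, 2))
```

Because the rotation is orthogonal, each variable's communality (the row sum of squared loadings) is unchanged by it; only the distribution of loadings across factors changes, which is what makes the rotated factors easier to match across the male, female and total samples.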

2. The three tests of imagery vividness discussed above differ in one or more ways from each 
other yet each has high loadings on the same factor. Each was administered at a different testing 
session. In the QMI (2) the 35 items to be imaged were read to the subjects. The Ways of 
Thinking questionnaire (19) required subjects to answer ‘true’ or ‘false’ to printed questions like, 
‘By using mental pictures of the elements of a problem, I am often able to arrive at a solution’. 
The correlation between these two tests for male and female samples analysed separately was 
not statistically significant but for the combined samples reached an acceptable level (r = 0.30;
P < 0.01).

Finally, the Vividness of Imagery questionnaire (33) was presented in booklet form and though 
the rating scale was the same as for the QMI (2) only items for the measurement of visual, 
auditory and kinaesthetic modalities were employed and not all of these were the same as those
used in the QMI. Table 2 shows that despite a 12 month interval, a different mode of 
administration, a different length of test and some differences in item content highly significant 
correlations were obtained between them for both male and female samples. 

The conclusion to be drawn from these findings is that self-reported imagery vividness remains 
relatively stable over time even when measured by somewhat different tests. 

3. Assuming that none of the subjects in these studies had deliberately randomized or in some 
other way deliberately distorted his responses to the request for a vividness rating of voluntarily 
produced images, then there are at least three types of unwitting bias that might influence the 
ratings that are made and consequently the final score obtained. The first concerns a subject's need 
to appear in a favourable light. If he believes that vivid imagery is desirable either because he 
suspects that the experimenter values it or because it is thought to be a good thing in his 
subgroup, any uncertainty in making a rating is likely to be resolved by making a slightly more 
vivid rating rather than a slightly less vivid one. The second source of possible bias concerns the 
subject's general level of familiarity with internal events ranging from such simple physiological 
states as fatigue to more complex psychological states like those associated with aesthetic 
enjoyment. Scores on a scale of imagery vividness might represent little more than an index of 
general familiarity with internal events, with those who believe themselves to be relatively 
familiar with such events being biased toward the making of higher ratings of vividness and 
those who are not, being biased in the opposite direction. A third source of unintentional bias 
could occur if a subject employed a very conservative criterion in assessing whether an image 
was ‘very clear and comparable in vividness to the actual experience’ or alternatively a very
liberal criterion. Either way the final scores obtained by such subjects would not be a ‘true’ 
reflexion of ‘actual’ imagery vividness but a reflexion of the criterion employed in making a 
rating decision. 


(a) Need: Contrary to the finding of Di Vesta et al. (1971) the Crowne & Marlow (1964) measure 
of social desirability (18) does not load highly on the imagery vividness factor. However, an 
examination of Table 2 shows that for the male sample only, it has a correlation of 0.43
(P < 0.05) with the QMI (2) measure of imagery vividness but that it does not have a correlation
with any other measure of imagery vividness that even approaches an acceptable level of
significance. A similar finding of a significant correlation between social desirability (18) and the
QMI (2) for males but not for females was obtained in the second study referred to at the
beginning of the method section (r = 0.36, P < 0.05). Again, in the third study in which male
naval recruits served as subjects, these two variables showed a significant positive correlation
(r = 0.25, P < 0.05). It is of special interest that social desirability (18) was uncorrelated with
habitual imagery scores on the Ways of Thinking questionnaire (19) in any of these three studies. 
The evidence collected so far suggests that the QMI is especially vulnerable to social 
desirability response set in male samples but not in female samples and that the Ways of 
Thinking questionnaire is not susceptible to these influences in either. 
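
The significance levels attached to correlations such as these follow from the usual t test for a Pearson coefficient, t = r√(n − 2)/√(1 − r²) with n − 2 degrees of freedom. The Python sketch below applies it to the correlation of 0.43 reported for the male sample; it assumes n = 26 (the number of males who completed all tests in the first study), which the text does not state explicitly for this coefficient, and it assumes a two-tailed test. The function name is hypothetical.

```python
import math


def t_from_r(r, n):
    """t statistic (n - 2 degrees of freedom) for testing a Pearson r against zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# Two-tailed critical value of t at the .05 level for 24 degrees of freedom
# (standard table value).
T_CRITICAL_05_DF24 = 2.064

# Assumption: the male-sample correlation is based on all 26 male subjects.
t_value = t_from_r(0.43, 26)
print(round(t_value, 2), t_value > T_CRITICAL_05_DF24)
```

Under these assumptions the statistic exceeds the tabled critical value, which is consistent with the P < 0.05 level quoted for this correlation.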


(b) Familiarity: The PEQ (38) was employed as a measure of the extent to which a subject was 
familiar with a variety of different subjective experiences. Loadings of -0.47 in the first study
(see Table 1) and of 0.36 in the second study were obtained for this variable on the imagery
vividness factor, but only for females. The difference in sign is of no consequence; in the first
study all the tests loading the imagery vividness factor for the female sample had negative 
loadings while in the second study the comparable loadings were positive. An examination of 
Table 2 shows that for the females the PEQ (38) correlates significantly with three of the six 
imagery vividness measures including the QMI (2) but not with the Ways of Thinking 
questionnaire (19). In the male sample only one of the six correlations is at an acceptable level 
of significance and that is between the PEQ (38) and a measure of visual imagery vividness (28).
Of the four imagery vividness measures used in the second study only the QMI (2) correlated
significantly with the PEQ (38) among the females (r = 0.45; P < 0.01). None of the correlations
between the PEQ (38) and any of the four imagery vividness measures approached an acceptable 
level of significance in the male sample. 

Though females do not differ significantly from males in their familiarity with internal events it 
is possible that they are more biased in the direction of rating their images as vivid when they 
have greater familiarity with other classes of subjective experience than when they have not. 
For men, overall familiarity with internal events is unconnected with rating of imagery 
vividness. The Ways of Thinking questionnaire (19) appears to be as free from this ‘familiarity’ 
response bias as it was free from the ‘need’ for approval response bias. 


(c) Criterion: The second study is the only one in which a measure of category width was 
included. Though it is known (Tajfel, Richardson & Everstine, 1964) that not all tasks designed 
to measure category width are measuring the same response style the one chosen in this study 
was the Estimation Questionnaire (Pettigrew, 1958). A high score is indicative of broad 
categorizing preferences, i.e. the tendency to include or accept into a target class (e.g. vivid 
imagery) an instance of this class about which the respondent is uncertain. For the females but
not for the males it was found that scores on the Estimation Questionnaire correlated 0.45
(P < 0.05) with rated vividness of imagery (11) and 0.39 (P < 0.05) with the imagery measure from
the Ways of Thinking questionnaire (19) but only 0.18 (n.s.) with the QMI (2). Thus some
evidence exists, at least for the females on the imagery rating task, that high and low scores on
self-report measures of imagery may result, in part, from a response style that orientates the 
subject either to accept a particular image as coming within the class of vivid images when its 
‘actual’ level of vividness is somewhat lower than this or to reject the ‘same’ image as coming 
within the class of vivid images. Why the Ways of Thinking questionnaire on habitual imagery 
(19) should be influenced by this response set but not the QMI (2) is less comprehensible. 

It is clear that biases produced by the need to give socially approved responses, by the degree 
to which the type of observation required is familiar and by the stringency of the criterion that a 
subject employs in deciding whether a particular image is to be categorized (rated) as more or 
less vivid, may all influence the final score obtained on self-report measures of imagery. The 
evidence just given suggests the type of task (test) and subject (sex) variables that may be 
especially susceptible to the operation of particular kinds of bias. 

4. No evidence was found which would justify the use of any objective (performance) test as 
an alternative to self-report tests in the measurement of overall imagery vividness. In particular 
tests of spatial ability appear to be completely unacceptable as measures of memory imagery 
vividness. Of the measures that loaded the imagery vividness factor but which were not directly 
concerned with imagery the Adjective Check List (13 and 14) is of most interest. Segal & 
Nathan (1964) had found that six adjectives (dependable, efficient, mannerly, persistent, sociable 
and sympathetic) were used more often in the self-description of vivid imagers than in weak 
imagers. In the first study this result was found for the females only (see Table 1). The 
adjectives suggest Eysenck’s description of the stable extravert and further investigation of this 
possibility should be undertaken. 

Clinical psychologists who have attempted to find objective measures of imagery vividness 
have been concerned primarily with vividness in the visual modality (see Rimm & Bottrell, 1969; 
Danaher & Thoresen, 1972; Rehm, 1973). Though the first study found no support for Rehm’s
(1973) suggestion ‘that the latency of report of “best image” attainable may have some
potential as an adjunct measure of visualization ability or vividness of visual image’ (p. 270) it
did revive interest in the possibility of using ‘breathing patterns’ (see review by Richardson, 
1969, pp. 72-73) for the identification of those who are most likely to be habitual visual imagers 
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as opposed to those who are habitual verbalizers. Though habitual mode of thinking (32) has its 
highest loading on verbal ability for the female sample (see Table 1) it also loads for this sample 
on imagery vividness. When men with the highest and lowest scores on the visual imagery 
modality of the QMI (2) were compared, a significant difference in the habitual mode of thinking 
(32) variable (P < 0.05) was obtained. A similar trend was found among the women (P < 0.10). 


Imagery control 


1. As reported previously (Richardson, 1972) scores on the Gordon test of imagery control (34 
and 35) are not independent of scores on tests of imagery vividness. This conclusion is 
supported by the factor analytic results shown in Table 1 and from the correlational analysis 
shown in Table 2. The positive association of scores on imagery vividness and imagery control is 
found in both male and female samples. Apart from overlap in the dimensions of imagery 
measured by the Gordon test and all but one of the tests of imagery vividness the imagery 
control factor is independent of all other cognitive and temperamental factors shown in Table 1. 

2. The new self-report test of imagery control (5) has a loading of 0.63 on the imagery control 
factor when the analysis is undertaken with the combined sample of men and women and a 
loading of 0.68 for this factor when the analysis is undertaken on the male sample only (see 
Table 1). For the male sample this new test (5) correlates 0.45 with the Gordon test (35) 
(P < 0.05), though each test was administered on a different occasion and in a different setting. It 
is noteworthy that this new test (5) is uncorrelated with any test of imagery vividness (see Table 
2), which suggests that it may serve as an independent measure of imagery control, at least with 
a male sample. 

3. Neither ‘need’ for approval (18) nor ‘familiarity’ with internal events (38) has a significant 
loading on the imagery control factor (see Table 1). However, it is of interest that ‘familiarity’ 
with internal events (38) correlates 0.39 (P < 0.05) with the new test of imagery control (5) for the 
female sample only. It will be remembered that a similar finding for females, but not for 
males, was reported for imagery vividness. 

4. Previous research reviewed by Richardson (1972) clearly indicated that control of Necker 
cube fluctuations (8) was associated with the ability to control imagery as measured by the 
Gordon test (34 and 35) supplemented by interview data. When the Gordon test is used as the sole 
source of information on imagery control the statistical reliability of the association with Necker 
cube fluctuations is reduced. Table 2 shows no significant correlation between these two variables. 
However, the new test of imagery control (5) shows a correlation with Necker cube fluctuations 
(8) of 0.39 (P < 0.05) for the female sample. It is also of interest that for this sample, but not for 
the males, a correlation of 0.52 (P < 0.01) was obtained between the Concealed Figures test (3) 
and Necker cube fluctuations (8). Again, for the female sample, but not for the male, rated 
controllability of imagery (12) correlated 0.47 (P < 0.01) with Necker cube fluctuations (8). 

As might have been expected from previous research findings (Richardson, 1972) the rigidity 
implied by high scores on Neuroticism (16) is negatively correlated with the flexibility implied by 
high scores on the Necker cube task (8). Though the correlations are not significant when males 
and females are treated separately, a correlation of −0.29 (P < 0.01) is obtained for the combined 
sample. 

As was suggested by Richardson (1972) the weight of the evidence suggests that imagery 
control may be one manifestation of the more general dimension of adaptive flexibility which has 
been identified by Frick, Guilford, Christensen & Merrifield (1959) as one factor in a large scale 
study of ‘rigidity’ tests. 


Conclusion 


Though the conceptual definition of a memory image as a quasi-perceptual experience has been 
recommended for general adoption the appropriate operational definitions of vividness and 
controllability have proved more difficult to establish. At this stage of research into the problems 
of measuring memory imagery the following tentative conclusions have been drawn. They are 
placed in approximate order with the least tentative conclusions first and the most tentative last. 

1. Self-report measures must necessarily serve as the initial criterion against which other more 
objective behavioural measures (e.g. Deckert, 1964) and physiological measures (e.g. Gale, 
Morris, Lucas & Richardson, 1972) can be validated. 

2. To achieve reliable and valid ratings on self-report scales of imagery vividness and 
controllability it will be necessary to: (a) improve the scales, e.g. by reducing the ambiguity that 
still exists in many of the items; (b) control the known influence of response sets, e.g. by 
subject selection or by partialing out their effects. 

3. For those researchers who wish to test hypotheses in which individual differences in 
imaging abilities are involved and who do not wish to wait until completely satisfactory 
measuring instruments have been constructed the following suggestions are offered: (a) select a 
female sample; (b) use the Ways of Thinking (19) questionnaire as a crude measure of imagery 
vividness on the grounds that it is free from at least two out of the three response biases 
examined in this paper (female sample only). It loads the imagery vividness factor and it is 
reasonable to assume that those who are habitual imagers will also have developed a capacity to 
generate more vivid images; (c) use the Necker cube fluctuations (8) as a measure of imagery 
control. Differences in the ability to control rate of fluctuation on this task correlate with scores 
on the new test of imagery control (female sample only). Past research (Richardson, 1972) has 
consistently implicated this test in the measurement of imagery control and it has the advantage 
of being clearly understood by the subject and objectively scored by the psychologist. 


Time of day effects in school children’s immediate and delayed recall of 
meaningful material 


Simon Folkard, Timothy H. Monk, Rosamund Bradbury and Joanna Rosenthall 





Independent groups of school children were read a story at either 09.00 or 15.00. The results from those 
children receiving an immediate memory test were in line with previous studies: children who were read the 
story at 09.00 obtained higher immediate memory scores than those who were read the story at 15.00. 
However, when the memory test was delayed by one week, children who were read the story at 15.00 
obtained higher memory scores than those who were read it at 09.00. This superiority of delayed recall 
following 15.00 presentation was unaffected by the time of day at which the recall was made, and is 
consistent with previous studies on the effect of arousal on long-term memory. It is pointed out that the 
current practice of teaching most academic matter in the morning is based on early studies, and that these 
studies failed to take account of the interaction now known to exist between arousal level at presentation 
and the delay of recall. 





A large number of studies have shown performance on tests of short-term memory (STM) to be 
superior in the morning to the afternoon. Much of the early work in this area (reviewed by 
Freeman & Hovland, 1934) was carried out on school children (e.g. Winch, 1911, 1912a, b; 
1913a, b; Gates, 1916) and was clearly aimed at laying down guidelines for the optimal 
arrangement of school timetables. Thus Gates (1916) concludes that ‘in general the forenoon is 
the best time for strictly mental work ... while the afternoon may best be taken up with school 
subjects in which the motor factors are predominant’ (p. 149). This conclusion is drawn from 
Gates’ finding that tasks involving a large STM component were performed best in the morning. 
Gates suggests that this afternoon inferiority of tasks involving STM is due to ‘mental fatigue’. 

Although more recent studies in this area have tended to ignore this early work, their findings 
have been essentially the same. Blake (1967, 1971) found performance on a range of tasks not 
involving STM to increase over the day and to parallel fairly closely the circadian rhythm in 
body temperature. He also found digit span, a ‘classic’ test of STM, to show virtually the 
opposite time of day effect and to be highest in the morning. This latter finding has been 
supported by the studies of Baddeley, Hatter, Scott & Snashall (1970) and Hockey, Davies & 
Gray (1972), both of which found performance on STM tasks to be superior in the morning to 
the afternoon. In reviewing this work Hockey & Colquhoun (1972) suggest that performance on 
tasks involving the ‘immediate processing’ of information increases over the day while 
performance on STM tasks decreases. This suggestion was extended by Folkard (1975) who 
found performance on two tasks of logical reasoning that involve both STM and ‘immediate 
processing’ to show a compromise between the functions relating STM and immediate 
processing performance to time of day. Performance speed on these tasks increased from 08.00 
to 14.00 and then decreased again over the rest of the day. In a further study Folkard, Knauth, 
Monk & Rutenfranz (1976) systematically varied the STM component of a visual search task in 
an experimental shift work situation. With a low STM load performance rose over the day and 
was highly correlated with body temperature. With an intermediate STM load performance 
showed no correlation with temperature, while with a high STM load performance was negatively 
correlated with temperature. 

While these more recent studies have confirmed the findings of the earlier work, the 
theoretical interpretation has differed markedly. Rather than being more ‘mentally fatigued’ in 
the afternoon, we are now thought to have a higher basal level of ‘arousal’ (Kleitman, 1963; 
Colquhoun, Blake & Edwards, 1968a, b). This interpretation is fairly generally accepted and is 
based on the finding that both body temperature and performance on ‘immediate processing’ 
tasks increase through the day. It has also proved possible to account for the afternoon 
inferiority of STM in terms of this arousal theory since a number of studies have indicated that 
STM is impaired under conditions of high arousal (e.g. Kleinsmith & Kaplan, 1963; Walker & 
Tarte, 1963; McLean, 1969). However these studies have also shown that high arousal benefits 
long-term memory (LTM). Indeed Craik & Blankstein (1975), in a recent review of the memory 
and arousal literature, conclude that the beneficial effect of arousal on LTM has been found 
rather more consistently than its detrimental effect on STM. In view of this it might be expected 
that LTM should be better following afternoon, rather than morning presentation. 

The authors have been able to find only three studies that have attempted to determine the 
relationship between time of day of presentation and LTM, or delayed recall, despite its obvious 
practical importance. The earliest of these is that of Laird (1925) who tested recall immediately 
and 40 min after subjects had (apparently) read a short passage of prose. It is unclear as to how 
the recall was made or scored. Nevertheless the immediate recall scores showed the typical 
decrease over the day, with the suggestion of a subsequent rise after about 8 p.m. The delayed 
recall scores showed almost exactly the same time of day effect and thus fail to support the idea 
that afternoon presentation may lead to superior LTM. However both the immediate and 
delayed recalls were made by the same subjects following a single presentation of the material. 
Thus the immediate recall may well have acted as a second presentation and swamped any time 
of day effect in delayed recall. Laird also confounded effects due to time of day of presentation 
with those due to time of day of delayed recall since both presentation and delayed recall took 
place at approximately the same time of day. This latter criticism can also be levelled at 
Baddeley et al. (1970) although the nature of the task they used made it impossible for them not 
to confound these factors. 

Baddeley et al. used the Hebb (1961) technique of repeating the same sequence of digits every 
third trial in a memory test involving the immediate recall of sequences of nine random digits. 
The difference between the recall scores for repeated and non-repeated trials was assumed to 
reflect the LTM component of this task. Their results showed the normal morning superiority for 
unrepeated lists (i.e. STM), while the LTM measure showed a slight, but non-significant, 
afternoon superiority. Baddeley et al. themselves point out that the enhancement of LTM by 
arousal is normally only found after delays of 20 minutes or more, and that it is debatable as to 
whether the Hebb technique really measures LTM. It is therefore unclear as to whether these 
results are at all pertinent to the idea that afternoon presentation may lead to superior LTM. 

Finally, Hockey et al. (1972) used a free recall paradigm to test immediate, and five-hour 
delayed, retention following presentation at either 23.00 or 07.00. While immediate recall was 
again found to be superior following morning rather than evening presentation, their main 
interest was in the amount of forgetting that took place over the five-hour delay period. The 
results indicated that forgetting was greater following the morning presentation than following 
evening presentation. In view of this interest in the forgetting aspect, Hockey et al. also used a 
design in which both immediate and delayed tests of the same list were given to the same 
subjects. There was some suggestion in their data that delayed recall was superior following 
evening presentation. However, Hockey (personal communication) points out that this was not 
significant, and that such a comparison confounds both differences in recall in the immediate 
test, and potential time of day effects in the efficiency of retrieval from LTM. 

The suggestion that the afternoon presentation of material should result in superior LTM is 
thus both predicted from the theoretical interpretation of the time of day in STM results, and 
supported to some extent by the available evidence. However this evidence is both weak and 
confounds the time of day of presentation with that of recall. It is also true that, at least in the 
more recent papers, somewhat unrealistic but rigorous memory tasks have been used in both the 
STM and LTM studies of time of day effects. The present study was thus designed to look at 
time of day effects in the immediate and delayed recall of realistic or meaningful material — a 
short story. Further, it was designed in such a way as to avoid confounding time of presentation 
with that of delayed recall, and to allow the comparison of delayed recall made at the same or 
different time of day to the presentation. This latter point is particularly important since it is well 
known that many physiological functions show a marked circadian rhythm (Conroy & Mills, 
1970) and that there is a ‘state dependent learning effect’ in that LTM is superior under the 
same physiological state as presentation than under a different one (e.g. Goodwin et al. 1969; 
Bustamante et al. 1970). It is quite possible that delayed recall might be superior at the time of 
day at which presentation took place to other times of day. 


Method 
Subjects 


The subjects were 62 male and 68 female pupils from six parallel tutor groups of a large comprehensive 
school. They had an age range of 12 years 5 months to 13 years 4 months (mean = 12 years 11 months). 
They were selected from the 155 pupils making up the six tutor groups on the basis of their having reading 
comprehension ages, as measured by the Daniels & Black ‘standard reading test’, within a range 11 years 6 
months to 13 years 6 months. The groups thus selected varied in size from 18 to 24 pupils, and all the groups 
had mean reading comprehension ages of 12 years 7 months (±2 months) and a range of 11 years 6 months 
to 13 years 6 months. 


Materials 


A single story, ‘A New Horse’ by Lo-Johannson (1968), was chosen in collaboration with the school as 
being both interesting for these children and unknown to them. In order to ensure a consistent presentation 
to the groups it was recorded on magnetic tape in a clear female voice. The story was approximately 2000 
words long and the recording lasted about 12 min. A multiple choice questionnaire containing 20 questions 
was printed. Each question was followed by four plausible alternatives, only one of which was correct, 
and the questions were in the same order as the information to which they related. The questions ranged 
from those of trivial detail to those demanding a true comprehension of the story. 


Design and procedure 


A between-subjects design was used. Three of the groups were played the story at 09.00 while the other 
three were played it at 15.00. Before the presentation all the groups were told to listen carefully to the story 
since they would be asked some questions about it. However, none of the groups were told when these 
questions would be asked. Following the 09.00 presentation one group was given an ‘immediate’ recall test 
at 09.15, after the other two groups had returned to their normal school activities. On the same day of the 
following week one of this remaining pair of groups was given the questionnaire at 09.15, the ‘same’ time of 
day, and the other at 15.15, the ‘different’ time of day, to presentation. A similar procedure was used for the 
groups receiving the 15.00 presentation. Thus one group was tested ‘immediately’ at 15.15 while the other 
two groups were tested a week later, one at 09.15 and one at 15.15. 

The use of this design meant that approximately twice as many children were given a delayed test as were 
given an immediate test. (This was necessitated by the fact that it is clearly impossible to give an immediate 
recall test at a ‘different’ time of day.) This design also involves assuming that, given a one week delay, a 
variation of ±6 h to this delay (i.e. in the ‘different’ time of day groups) will only affect recall level to a 
negligible extent. 
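
The size of this assumed variation can be illustrated with a short calculation. The sketch below is not part of the original study: the calendar date is an arbitrary placeholder, the function and variable names are hypothetical, and only the clock times and the one week delay are taken from the procedure described above.

```python
from datetime import datetime, timedelta

# Arbitrary placeholder date; only the clock times come from the text.
DAY0 = datetime(1976, 1, 5)

def at(day, hhmm):
    """Return the datetime for clock time 'HH.MM' on the given day."""
    hours, minutes = (int(part) for part in hhmm.split("."))
    return day + timedelta(hours=hours, minutes=minutes)

def delay_hours(presented, tested, weeks_later):
    """Hours from presentation to test, the test falling weeks_later weeks on."""
    start = at(DAY0, presented)
    end = at(DAY0 + timedelta(weeks=weeks_later), tested)
    return (end - start).total_seconds() / 3600

# (presentation time, test time, weeks between presentation day and test day)
groups = {
    "09.00 immediate": ("09.00", "09.15", 0),
    "09.00 same": ("09.00", "09.15", 1),
    "09.00 different": ("09.00", "15.15", 1),
    "15.00 immediate": ("15.00", "15.15", 0),
    "15.00 same": ("15.00", "15.15", 1),
    "15.00 different": ("15.00", "09.15", 1),
}

delays = {name: delay_hours(*spec) for name, spec in groups.items()}
for name, hours in delays.items():
    print(f"{name:16s} {hours:7.2f} h")
```

On these assumptions the ‘different’ time of day groups were tested 6 h later or 6 h earlier than the corresponding ‘same’ time of day groups, which is under 4 per cent of the one week interval.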

During the recall periods the subjects were each given the multiple choice questionnaire and asked to 
complete it as quickly as possible. They were instructed to guess if they could not remember the answer to a 
question, and to ensure that they chose one of the alternatives for each of the 20 questions. All subjects 
complied with these instructions. They were allowed as long as was necessary to complete the questionnaire, 
and no subject took longer than 15 min. 


Results 


The questionnaires were scored simply in terms of the number of questions answered correctly, 
no attempt being made to correct for guessing. Since preliminary analysis indicated that there 
were no significant sex differences this factor was ignored in the subsequent analyses. 
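
For orientation, the chance level on this questionnaire and the conventional correction for guessing (which the present analysis did not apply) can be sketched as follows. The formula R − W/(k − 1) is the standard textbook correction and is not taken from this paper; the function names are hypothetical.

```python
N_QUESTIONS = 20      # questions in the questionnaire (see Materials)
N_ALTERNATIVES = 4    # alternatives per question (see Materials)

def chance_score(n_questions=N_QUESTIONS, k=N_ALTERNATIVES):
    """Expected number correct if every answer were a blind guess."""
    return n_questions / k

def corrected_score(n_right, n_questions=N_QUESTIONS, k=N_ALTERNATIVES):
    """Conventional correction for guessing: R - W / (k - 1).

    Assumes every question was answered, as the subjects were instructed,
    so the number wrong is simply n_questions - n_right.
    """
    n_wrong = n_questions - n_right
    return n_right - n_wrong / (k - 1)

print(chance_score())        # 5.0
print(corrected_score(5))    # 0.0  (pure guessing corrects to zero)
print(corrected_score(20))   # 20.0 (a perfect score is unchanged)
```

Because every subject answered all 20 questions, this correction is a linear transformation of the raw score, so applying it would not alter the direction of any difference between group means.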
Table 1. Mean number of correct answers following 09.00 and 15.00 presentation for immediate 
and one week delayed recall 

                              Recall
Time of
presentation     Immediate        Delayed (1 week)

09.00            16.8 (n = 22)    12.9 (n = 42)
15.00            15.2 (n = 20)    14.0 (n = 46)


Table 2. Mean number of correct answers under delayed recall for those groups recalling at the 
same or different time of day to that at which the story was presented 

                           Delayed recall
Time of          ‘Same’           ‘Different’
presentation     time of day      time of day

09.00            13.2 (n = 18)    12.7 (n = 24)
15.00            14.1 (n = 23)    14.0 (n = 23)


Two main lines of analysis were pursued. In the first one no account was taken of the time of 
day at which the delayed recall took place. Thus this line of analysis was concerned only with 
the time of day of presentation and its effect on immediate and delayed recall. The mean number 
of correct answers made under immediate and delayed recall following presentation at either 
09.00 or 15.00 is shown in Table 1. The use of independent t tests indicated that 09.00 
presentation led to superior immediate recall (t = 2.16, d.f. = 40, P < 0.025) but inferior delayed 
recall (t = 1.94, d.f. = 86, P < 0.05) compared to 15.00 presentation. Analysis of variance 
confirmed the presence of a significant interaction between the time of presentation and delay of 
recall (F = 7.77, d.f. = 1, 126, P < 0.01). It also indicated that there was a significant main 
effect of delay (F = 27.27, d.f. = 1, 126, P < 0.001). 

The second line of analysis pursued was concerned with whether delayed recall at the ‘same’ 
time of day as presentation was superior to that at the ‘different’ time of day. The mean number 
of correct answers made by the four groups given delayed recall is shown in Table 2. Analysis 
of variance indicated that there was no evidence that ‘same’ time of day recall was superior to 
‘different’ time of day recall (F < 1, d.f. = 1, 84, P > 0.25), or that this factor interacted with 
time of day of presentation (F < 1, d.f. = 1, 84, P > 0.25). 
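
The degrees of freedom reported in the two analyses follow from the group sizes given in Tables 1 and 2, and the delayed recall means in Table 1 are the size-weighted averages of the Table 2 cells. The sketch below is an illustrative consistency check rather than a re-analysis; the variable names are hypothetical and only the tabulated means and group sizes are taken from the paper.

```python
# Means and group sizes as tabulated (Tables 1 and 2), stored as (mean, n).
immediate = {"09.00": (16.8, 22), "15.00": (15.2, 20)}
delayed_cells = {
    "09.00": {"same": (13.2, 18), "different": (12.7, 24)},
    "15.00": {"same": (14.1, 23), "different": (14.0, 23)},
}

def pooled(cells):
    """Size-weighted mean and total n over a dict of (mean, n) cells."""
    total_n = sum(n for _, n in cells.values())
    mean = sum(m * n for m, n in cells.values()) / total_n
    return mean, total_n

delayed = {time: pooled(cells) for time, cells in delayed_cells.items()}

# Independent-groups t test: d.f. = n1 + n2 - 2
df_t_immediate = immediate["09.00"][1] + immediate["15.00"][1] - 2
df_t_delayed = delayed["09.00"][1] + delayed["15.00"][1] - 2

# Two-way between-subjects analysis of variance: error d.f. = N - number of cells
n_delayed = sum(n for _, n in delayed.values())
n_total = sum(n for _, n in immediate.values()) + n_delayed
df_error_presentation_by_delay = n_total - 4   # 2 presentation times x 2 delays
df_error_same_different = n_delayed - 4        # 2 presentation times x 2 recall times

print(df_t_immediate, df_t_delayed)                              # 40 86
print(df_error_presentation_by_delay, df_error_same_different)   # 126 84
for time, (mean, n) in delayed.items():
    print(time, f"{mean:.2f}", n)
```

The pooled delayed means come out at about 12.91 and 14.05, which agree with the 12.9 and 14.0 of Table 1 to within the rounding of the tabulated cell means.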


Discussion and conclusions 


The results of the immediate recall tests would appear to support the findings of earlier studies 
that immediate recall is superior in the morning. They can also be considered to extend the 
findings of at least the more recent studies in that this superiority was found using meaningful 
material rather than lists of random digits or words. However, it should be pointed out that these 
results are not entirely consistent with the studies relating arousal to STM. Even in the 
immediate recall situation a delay of approximately 15 min occurred between an item being read 
in the story, and it being tested on the questionnaire. The results of Kleinsmith & Kaplan (1963) 
and Walker & Tarte (1963) indicate that this delay is approximately that at which a cross-over 
normally occurs between recall scores obtained following high arousal presentation and those 
following low arousal presentation. Thus in the present study it could be predicted that no 
difference should be found in the immediate recall data. However, in a further study (Folkard, in 
preparation) evidence was obtained that suggests that the detrimental effect of arousal found for 
some STM tasks may be due to a reduction in the subvocal rehearsal that subjects typically 
engage in on those tasks. It seems improbable, given the nature and quantity of the material to be 
remembered, that the subjects in the present study adopted a subvocal rehearsal strategy. They 
may, however, have used an alternative form of maintenance rehearsal (Craik & Lockhart, 1972) 
that is more appropriate to remembering a story, and, like subvocal rehearsal, is disrupted or 
inhibited by arousal. If this is the case, the presence of a morning superiority of ‘immediate’ 
recall in the present study could be accounted for by assuming that this alternative form of 
maintenance rehearsal aids retention over a rather longer period of time than subvocal rehearsal. 

The presence of an interaction between time of day of presentation and delay of recall is 
entirely consistent with the arousal and memory literature. So too is the superiority of delayed 
recall following afternoon presentation. However, this latter finding casts some doubt on the 
generality of Gates’ (1916) conclusion that ‘mental work’ should be confined to the morning. It 
is clearly impossible to recommend a reversal of the current school practice of teaching the 
major part of the academic matter in the morning on the basis of a single study. However, it is 
equally clear that many of the studies on which this practice appears to be based have made the 
erroneous assumption that what is good for immediate recall will be good for delayed recall. 
Time of day effects in performance are far more complex and task specific than was recognized 
in the early studies in this field. 

From a practical point of view it is also important that the superiority of delayed recall 
following afternoon presentation was found to be independent of the time of day at which recall 
took place. Thus the present study yielded no evidence in favour of the suggestion that time of 
day constitutes a ‘state’ in the state dependent learning sense. This failure to find a state 
dependent learning effect may well be due to the relatively small difference between the 
physiological states that exist at 09.00 and 15.00. Alternatively it may be due to the use of what 
was essentially a recognition test, since there is some evidence that state dependent learning 
effects are more marked when recall rather than recognition measures are used (Goodwin et al. 
1969). The use of more extreme times of day, e.g. 04.00 and 20.00, or more sensitive memory 
tests, might give rise to a significant state dependency effect. However, the present results do 
suggest that this effect would be relatively small in comparison to that of the time of day of 
presentation. 

Finally, it is perhaps worth pointing out that the differences associated with time of day of 
presentation were about 10 per cent in the case of immediate recall and 8 per cent in the case of 
delayed recall. As such they are similar to those obtained when people are limited to three 
hours sleep (Wilkinson, Edwards & Haines, 1966), or given the ‘legal limit’ of alcohol for 
driving (Drew, Colquhoun & Long, 1959). Clearly they should not be ignored. 
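
The percentages quoted here can be approximately recovered from the Table 1 means. The paper does not state which mean served as the base, so the sketch below, whose names are hypothetical, expresses each difference relative to each of the two group means in turn.

```python
# Mean number of correct answers (out of 20) from Table 1.
means = {
    "immediate": {"09.00": 16.8, "15.00": 15.2},
    "delayed": {"09.00": 12.9, "15.00": 14.0},
}

def percent_difference(a, b):
    """Absolute difference between a and b as a percentage of each in turn."""
    diff = abs(a - b)
    return 100 * diff / a, 100 * diff / b

for condition, by_time in means.items():
    rel_0900, rel_1500 = percent_difference(by_time["09.00"], by_time["15.00"])
    print(f"{condition}: {rel_0900:.1f}% of the 09.00 mean, "
          f"{rel_1500:.1f}% of the 15.00 mean")
```

Either base gives a figure close to 10 per cent for immediate recall and close to 8 per cent for delayed recall.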


Hypnotic susceptibility and personality: The consequences of diazepam and 
the sex of the subjects 


H. B. Gibson, M. E. Corcoran and J. D. Curran 


It was suggested by Gibson & Curran (1974) that the rather complex relationships found to obtain between 
hypnotic susceptibility and the personality parameters of extraversion and neuroticism might be understood 
by considering neuroticism as a moderator variable (as had been suggested by Furneaux & Gibson, 1961). 
They made the hypothesis that if a tranquillizing drug were administered the operative level of neuroticism 
would be decreased, and as a consequence the level of susceptibility of neurotic extraverts would be raised, 
and that of neurotic introverts lowered. 

This study reports the test-retest data on a sample of 71 subjects on the Stanford Hypnotic Susceptibility 
Scales, half of whom were retested with diazepam and half with nicotinic acid. The hypothesis was 
confirmed and additional data are given on the drug/placebo effects on each item of the scale. The 
significance of drugs on different aspects of hypnotic susceptibility in relation to personality is discussed. 





The relationship between personality variables, as measured on the Eysenck Personality 
Inventory (EPI, Eysenck & Eysenck, 1964) and hypnotic susceptibility, as measured on the 
Stanford Hypnotic Susceptibility Scale, Form A (SHSS, Weitzenhoffer & Hilgard, 1959) was 
tested on a sample of 43 subjects by Gibson & Curran (1974). This study aimed for a more 
sophisticated replication of Furneaux & Gibson's (1961) study and found the same general 
relationships obtaining between hypnotic susceptibility and measures of extraversion, 
neuroticism and ‘lying’. This work was then repeated on a further sample of 45 subjects by 
Gibson & Corcoran (1975) which confirmed the relationships but demonstrated that the 
significant negative relationship between hypnotic susceptibility and ‘lying’ existed only for the 
male subjects, a sex difference not previously suspected. 

Gibson & Curran (1974) suggested that the somewhat complex relationships between the 
variables studied might be accounted for by regarding neuroticism as a moderator variable as 
had been suggested by Furneaux & Gibson (1961). While the least susceptible subjects were the 
neurotic extraverts the most susceptible tended to be the neurotic introverts. In that half of the 
sample low on neuroticism the relationship between extraversion and hypnotic susceptibility was 
reversed. Gibson & Curran hypothesized that, were a tranquilizing drug used in a future 
experiment, with the lowering in the operative level of neuroticism, its supposed moderator 
effect would be reduced. The present study was designed to test this hypothesis. 

The evidence concerning the effect of drugs on suggestibility is very conflicting. For example, 
while Hull (1933) used alcohol balanced by a placebo, and concluded that it had little measurable 
effect on heterosuggestibility, earlier workers like Pettey (1913) and Moll (1890) were convinced 
that such drugs as morphine, alcohol, ether, chloroform, chloral, cannabis and various bromides 
definitely increased hypnotic susceptibility and such assertions are confidently repeated by 
modern writers (e.g. Petrie & Stone, 1969). However, such statements are largely based on 
impressionistic clinical work, and as Vingoe (1973) has pointed out in his excellent review of the 
drug literature: ‘That many people still speak of certain drugs as hypnotics because they induce 
a sleeplike state is understandable (since certain drugs and the typical hypnotic induction are 
thought to induce a form of sleep) but also confusing and unfortunate in that many may be led 
to believe that the drug-induced state and the hypnotically induced state are, if not one and the 
same thing, then at least very similar’ (p. 96). 

In an experimental context Sjoberg & Hollister (1965) report that some, though not all, 
psychotomimetic drugs produce an average enhancement in primary suggestibility closely 
comparable to that produced by hypnotic induction. Hull (1933) reports an experiment by 
Baernstein in which 19 subjects were given the postural sway test in a balanced design involving 
scopolamine hydrobromide and a placebo. It was found that the drug did not increase the 
susceptibility of those initially resistant, but it produced a significant increase in the measured 
sway of those already susceptible. These results find a parallel in the study by Eysenck & Rees 
(1945) where subjects were administered sodium amytal or inhaled nitrous oxide. It should be 
noted that Hull's postural sway test is about the best single predictor of hypnotic susceptibility 
to have emerged from the literature of experimental hypnosis, but it has its special limitations in 
drug research, as will be discussed later. 

While studies such as these compare the effects of drugs with those of hypnotic induction 
upon an individual’s suggestibility, or examine the change in suggestibility resulting from an 
administration of a drug, few studies, as Jones (1972) and Vingoe (1973) indicate, convincingly 
examine the effects of drugs on hypnotic susceptibility. While measures of primary 
suggestibility and hypnotizability are highly related, it is less than meaningful to infer that since 
certain drugs alter primary suggestibility their effect will be the same for hypnotizability. There 
seems no good reason, for example, to suppose that since LSD enhances primary suggestibility 
to a degree comparable with hypnotic induction (Sjoberg & Hollister, 1965), the combination of 
these two treatments would in any way be additive. 

The hypothesis being tested by the present study is that a tranquilizing drug will either 
increase or decrease hypnotic susceptibility according to the personality characteristics of the 
individual subject. Such a view may further understanding of the conflicting evidence which is 
supplied by the relevant experimental literature. 


Method 
Subjects 


The subjects were 32 males and 39 females, all undergraduate students on an honours psychology course. 
They had previously acted as experimental subjects in the studies reported earlier (Gibson & Curran, 
1974; Curran & Gibson, 1974; Gibson & Corcoran, 1975). 


Materials 


All subjects had completed the EPI, Forms A and B, and had been tested on the SHSS, Form A, as 
previously reported. In the present study they were given the parallel form, SHSS Form B, the slight 
technical modifications introduced by Curran & Gibson (1974) being observed. Diazepam (5 mg) was used as 
a tranquillizer and nicotinic acid (50 mg) as a placebo. The latter substance has no central effect but because 
of its curious peripheral influence (vasodilation, tingling, etc.) those who received it alone would not suspect 
an inert dummy. This is possibly important in hypnosis research because we know little about the relationship 
between placebo reaction and suggestibility, even though Evans (1969) has implied that the two are little 
related. 


Design 


The 71 subjects were divided into groups designed to receive the drug (D) and placebo (P) after testing on 
the SHSS Form A. Care was taken that each group received approximately equal numbers from each 
quadrant of the scatter-plot formed by the axes of extraversion and neuroticism of the EPI scores. The 
identity of the individuals allocated to the D and P groups was unknown to the experimenters until after the 
completion of the experiment, the allocation being made by a secretary. In fact, the D group was given 
diazepam plus nicotinic acid and the P group nicotinic acid alone. 

Subjects had been invited to do the A form of the SHSS without any pressure or incentive, it being 
intended that all should be true volunteers. When they were invited to repeat the experiment with drugs, 
again no sort of pressure was used, but a token incentive of 50p was offered as a recognition of the favour. 
Of the 88 subjects who were invited to repeat, 19 did not respond. Non-responders had the unfortunate 
effect of producing a numerical imbalance between the D (n = 34) and P (n= 37) groups, and between the 
numbers coming from each personality quadrant. 


Hypnotic susceptibility and personality 53 
Table 1. Scores on the SHSS, Forms A and B 


Neurotic Neurotic Stable Stable 

Extraverts Introverts Extraverts Introverts 

A B diff. A B diff. A B diff. A B diff. 

Drug group 

0 1 1 5 4 0 2 2 1 0 -1 

1 3 2 3 1 -2 2 1 -1 1 1 0 
5 6 I 6 5 -1 3 8 5 3 5 2 

9 9 0 5 5 0 6 7 1 

(0 (5 5 11 6 -5 8 12 4 6 8 2 


oO 9 2 (5) (5) 0 (2) C) 5 (3) (0) -3 


(3) = (4) (5) l 
(8) (11) 3 (6) (4) -2 
(6) (6) 0 
(8) (0) -8 
(9) 3) -6 
Placebo group 
0 0 1 0 -1 3 2 0 0 0 
2 1 -i 10 11 1 2 4 2 
3 2 -1 (0) (0) 0 11 -2 4 3 -1 
4 3 -1 (1) (1) 0 4 4 0 
5 8 3 (2) (2) 0 (4) (5) I 6 12 6 
(4) (4) 0 6) 6) 0 
3) Q =I (4) (4) 0 (5 (6) 1 (4) (4) 0 
60 Q -I (6) (5) -1 6) (8) 3 
0 ©) -4 (9) (11) 2 (6) (8) 2 
(9) (10) 1 (9) (11) 2 (10) (7) -3 
0) ©) -I (9) (7) (12) (10) -2 




Males, n = 32 (plain figures); Females, n = 39 (in parentheses). 


Results 


Table 1 shows the scores obtained by the 71 subjects on the Stanford Scale, Forms A and B. 
These subjects are divided according to their scores on the EPI and whether they had received 
drug or placebo prior to the administration of SHSS: B. On the neuroticism (N) scale subjects 
were divided at score > 24 to give a 35:36 split. A similar dichotomy into extraverts and introverts 
was precluded by the resulting small number of subjects in the neurotic extravert (NE) quadrant 
receiving the drug. Subjects were therefore divided into a somewhat uneven split (38:33) on the 
extraversion scale at score ≥ 27. 

The numbers in the separate personality quadrants vary from 6 to 12. This is largely because 
of the fact, uncontrollable by the present experimental design, that different personality types 
vary in their tendency to volunteer for such a repeat experiment, the stable extraverts being the 
most frequent volunteers. Table 1 also highlights the important effect of sex difference: by an 
unlucky chance only one male appears in quadrant Placebo, NI, and only one female in quadrant 
Placebo, SI. Meaningful analysis of all the relevant variables by the method of analysis of 
variance is therefore precluded by the small numbers in some categories. 
Table 2. Change in scores from SHSS: A to SHSS: B for each of four EPI quadrants and two 
drug conditions 


                 Drug                              Placebo 

       Decreased   Same   Increased      Decreased   Same   Increased 
NE         0         1        5              7         1        2 
NI         5         2        2              4         5        2 
SE         5         2        5              3         1        6 
SI         3         1        3              1         3        2 


Table 3. Frequency of subjects high on neuroticism with respect to extraversion and drug or 
placebo 


Hypnotic susceptibility 
Decreased Increased or same 


Extraverts 0 (7) 6 (3) 
Introverts 5 (4) 4 (7) 


Drug group - plain: Placebo group - in parentheses. 


From the overall results presented in Table 1 it is apparent that the drug group changed more 
on retest than the placebo group, since the A-B difference scores tend to be larger. Averaging the 
difference scores without respect to sign, the means are 2.235 and 1.351 for drug and placebo 
groups respectively (t = 4.46, d.f. = 69, P < 0.001). It was to be expected that the drug group 
would change more, both increasing and decreasing their susceptibility scores, because they 
were experiencing the centrally acting chemical. An examination of each cell in Table 1 in turn 
reveals that, as predicted, neurotic extraverts administered the tranquillizing drug do tend to 
increase in hypnotic susceptibility (sign test, P = 0.031). Not predicted was the tendency of 
neurotic extraverts given the placebo to decrease slightly in hypnotic susceptibility (sign test, 
P = 0.09). 
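
The two sign-test probabilities quoted above can be checked with a short sketch. It assumes a one-tailed test with unchanged subjects dropped; that reading is an inference from the fact that it reproduces the published values, not something the text states. The counts are those given for the neurotic extravert (NE) quadrant in Table 2, and the function name is purely illustrative.

```python
from math import comb


def sign_test_one_tailed(n_increase: int, n_decrease: int) -> float:
    """One-tailed sign test under H0: increases and decreases equally likely.

    Returns the probability of observing a split at least as lopsided as
    the one given, in the observed direction. Ties must already be excluded.
    """
    n = n_increase + n_decrease
    k = min(n_increase, n_decrease)  # count in the rarer direction
    return sum(comb(n, i) for i in range(k + 1)) / 2 ** n


# NE quadrant, drug group: 5 increased, 0 decreased (1 unchanged, dropped)
p_drug = sign_test_one_tailed(5, 0)

# NE quadrant, placebo group: 2 increased, 7 decreased (1 unchanged, dropped)
p_placebo = sign_test_one_tailed(2, 7)

print(f"drug: {p_drug:.5f}  placebo: {p_placebo:.5f}")
```

Under these assumptions the drug value is 1/32 and the placebo value 46/512, which agree with the reported P = 0.031 and P = 0.09 to the precision given.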

Table 2 shows the number of individuals according to their drug treatment and EPI score 
whose susceptibility score decreased, remained the same or increased on SHSS: B. Study of 
Table 2 shows that for the more stable half of the population, the administration of drug or 
placebo has very little effect on hypnotic susceptibility. But if we study the more neurotic half 
we see that the drug, as compared with the placebo, has had a very significant effect according 
to the level of extraversion. The significance of this effect is best conveyed by the levels of P 
obtained by Fisher's test applied to Table 3. 

The levels of significant relationships between (i) hypnotic susceptibility, (ii) drug vs. placebo, 
(iii) introversion vs. extraversion can best be appreciated by the following statements, applying, 
of course, only to the half of the population high on neuroticism: 


Drug group 
A decrease in susceptibility of introverts vs. extraverts P = 0.042 


Placebo group 
A decrease in susceptibility of extraverts vs. introverts P = 0.115 
Table 4. Test-retest correlations (SHSS A/B) with respect to differences in drug/placebo and sex 


                         Both sexes    Males     Females 
All subjects (n = 71)      0.715*      0.806*     0.635* 
Drug (n = 34)              0.523**     0.817*    -0.004† 
Placebo (n = 37)           0.839*      0.800*     0.867* 


Product moment correlations. * P < 0.001; ** P < 0.01; † not significant. 


Table 5. Changes in level of hypnotic susceptibility of the females who took the drug 
(comparison between SHSS A and B) 


             Initial score on A 
          Low (0-4)    High (5+) 
A < B         4            1 
A > B         3            8 


By Fisher's test P = 0.077. 


Table 6. Test-retest correlations (SHSS A/B) with respect to sex (drug group only) 


            Neuroticism low    Neuroticism high 
Males            0.881              0.734 
Females         -0.290              0.381 


Extravert group 
A decrease in susceptibility of placebo vs. drug P = 0.011 


Introvert group 
A decrease in susceptibility of drug vs. placebo n.s. 
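
As a rough check on the Fisher probabilities, the sketch below computes a one-tailed exact test directly from the hypergeometric distribution. The one-tailed reading is an assumption made because it reproduces several of the published figures; the function name and cell layout are illustrative only, and the cell counts are taken from Tables 3 and 5.

```python
from math import comb


def fisher_one_tailed(a: int, b: int, c: int, d: int) -> float:
    """One-tailed Fisher exact probability for the 2x2 table [[a, b], [c, d]].

    With all margins fixed, sums the hypergeometric probabilities of every
    table whose top-left cell is at least as extreme as `a`, in the
    direction in which `a` departs from its expected value.
    """
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(x: int) -> float:
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    lowest = max(0, row1 + col1 - n)
    highest = min(row1, col1)
    expected = row1 * col1 / n
    cells = range(a, highest + 1) if a >= expected else range(lowest, a + 1)
    return sum(prob(x) for x in cells)


# Table 3, drug group: rows = extraverts, introverts;
# columns = decreased, increased or same
p_drug_group = fisher_one_tailed(0, 6, 5, 4)

# Table 3, extraverts: rows = drug, placebo
p_extraverts = fisher_one_tailed(0, 6, 7, 3)

# Table 5, females on the drug: rows = A < B, A > B; columns = low, high
p_table5 = fisher_one_tailed(4, 1, 3, 8)

print(f"{p_drug_group:.4f} {p_extraverts:.4f} {p_table5:.4f}")
```

On this reading the three results correspond to the reported P = 0.042, P = 0.011 and P = 0.077.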


Test-retest correlations with respect to differences of drug/placebo and sex 


The test-retest correlations between the first and second administration of the SHSS are shown 
in Table 4. All correlations are positive and significant except for the 16 females taking the drug, 
where the correlation is virtually zero. This curious fact may be examined further by considering 
the test-retest scores of these females. Their changes in level of susceptibility, with regard to 
their initial level of susceptibility, are shown in Table 5. There is thus a tendency, although not 
very significant (P = 0.077), for females who were given the drug to react according to their 
original level of susceptibility. Most of those who were initially of low susceptibility on Form A 
tended to increase their level on Form B. By contrast, the drug appears to have reduced the 
level of susceptibility for most of the females who were initially high on Form A, hence the 
paradox of the zero test-retest correlation for this female group. No such contrast effect is 
demonstrable with females on the placebo, or with the male group. 
As it has been demonstrated that in the drug group females are somewhat different from males 
in their retest reaction, it may be questioned whether it is legitimate to pool the scores of the 
two sexes as in Table 3. However, a still further analysis justifies this pooling. The paradoxical 
negative test-retest coefficient applies only for the females below the median on neuroticism, as 
may be seen in Table 6. As the interaction illustrated in Table 3 concerns only those above the 
median on neuroticism, the pooling is justified. 

The effect of the drug vs. placebo on the individual scale items 


The data need to be examined to see whether the group given the drug were in fact initially 
equivalent in susceptibility on all items of the Form A of the SHSS, a fact which cannot be 
assumed because of the complications introduced by some subjects withdrawing. 

Figure 1 shows the percentages passing each item of the SHSS Form A in the groups later given 
drug or placebo. It may be observed that the placebo group is relatively more susceptible on the 
majority of items, but when the differences are tested by the chi-square test none of them 
approaches significance. 


Figure 1. Percentages of subjects passing each item of initial test SHSS Form A. Items (vertical 
axis): postural sway, eye closure, hand lowering, arm immobilization, finger lock, arm rigidity, 
moving hands, verbal inhibition, hallucination, eye catalepsy, posthypnotic suggestion, amnesia. 
Horizontal axis: percentage passing (10-80). Later given drug o——o; later given placebo x—x. 


Forms A and B of the SHSS contain items which are similar, but many are not identical. Table 
4 shows that the overall global scores of susceptibility give test-retest correlations which are 
reasonably high, except for the group of females who were given the drug. It is of special 
interest to see the extent to which the two Forms A and B are equivalent item by item. This has 
been done by combining the data of the drug and placebo groups on Form B and comparing them 
with Form A as set out in Fig. 2. In no case is the frequency of passing an item significantly 
different on the two scales. 

Figure 2. Percentage of subjects passing each item on SHSS Form A and Form B. Items (vertical 
axis): postural sway, eye closure, hand lowering, arm immobilization, finger lock, arm rigidity, 
moving hands, verbal inhibition, hallucination, eye catalepsy, posthypnotic suggestion, amnesia. 
Horizontal axis: percentage passing each item (10-80). 

Three problems have to be considered regarding the increase or decrease of the rate of passing 
on each item on Form B: (i) the relative ‘difficulty’ of the item on Form A and B; (ii) the 
relative effects of drug or placebo; (iii) the different base lines on Form A on each item from 
which the drug and placebo groups begin. It is not sufficient, therefore, to compare the relative 
rates of passing on each item of the drug and placebo groups even though, as shown in Fig. 1, 
on no item is the difference statistically significant. The magnitude of the test-retest correlation 
on each item reflects two separate things: (a) the degree to which the same item in the A and B 
scales is psychologically similar, and (b) the relative effect of the drug and placebo treatments. 
The effect of (a) must be discounted before we can arrive at an estimate of (b). 

A simple measure of the test-retest correlation on each item is to consider the number of 
cases discrepant between A and B scales comparing drug and placebo groups, using the 
following comparisons: 


Drug Placebo 
Discrepant a b 
Concordant c d 


Such a table lends itself to the chi-square test if the expected frequencies in each cell are > 5. 

This computation has been done for each of the 12 items of the SHSS scale and the results are 
presented in Table 7, the percentage of cases discrepant also being given. For items 8 and 10 
chi-square values cannot be given as expected frequencies are too small. 

It may be seen from Table 7 that in 8/12 items the drug group gives a relatively greater 
percentage of discrepant cases. This reflects the significant t value obtained earlier for the 
overall scores indicating that the drug, as expected, had more effect on hypnotic performance 
than the placebo. It should be noticed however that the drug does not invariably enhance 
hypnotic performance. In items 1, 4, 8 and 12 hypnotic performance is lowered. In only one case 
(item 3, hand lowering) is the relative discrepancy between the A and B scores statistically 
significant, the drug enhancing performance more than the placebo. 
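
The discrepant/concordant comparison can be sketched as follows. Two things here are assumptions rather than statements from the text: the cell counts are back-calculated from the Table 7 percentages and the group sizes (n = 34 and n = 37), and the continuity (Yates) correction is used because it appears to reproduce the published chi-square values. The function name is illustrative.

```python
def chi_square_2x2_yates(a: int, b: int, c: int, d: int) -> float:
    """Chi-square with continuity correction for the table

                    Drug   Placebo
        Discrepant    a       b
        Concordant    c       d
    """
    n = a + b + c + d
    corrected = max(0.0, abs(a * d - b * c) - n / 2)
    return n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Item 3, hand lowering: 29.41% of 34 -> 10 discrepant; 8.11% of 37 -> 3
chi_item3 = chi_square_2x2_yates(10, 3, 34 - 10, 37 - 3)

# Item 1, postural sway: 32.35% of 34 -> 11 discrepant; 18.92% of 37 -> 7
chi_item1 = chi_square_2x2_yates(11, 7, 34 - 11, 37 - 7)

CRITICAL_05_DF1 = 3.841  # chi-square critical value, P = 0.05, d.f. = 1

print(f"item 3: {chi_item3:.2f}, significant: {chi_item3 > CRITICAL_05_DF1}")
print(f"item 1: {chi_item1:.2f}, significant: {chi_item1 > CRITICAL_05_DF1}")
```

With these reconstructed counts only item 3 exceeds the 5 per cent critical value, in line with the statement that hand lowering is the single significant item.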


Table 7. Comparison of test-retest data for drug and placebo groups for each item 


                                  Drug                  Placebo 
                            Per cent              Per cent              Per cent     Chi 
Item                        discrepant  Change    discrepant  Change    difference   square 
 1. Postural sway             32.35       -         18.92       -         13.43*     1.05 
 2. Eye closure               26.47       +         13.51       -         12.96*     1.14 
 3. Hand lowering             29.41       +          8.11       ?         21.30*     4.04† 
 4. Arm immobilization        20.59       -         27.03       +          6.44      0.13 
 5. Finger lock               20.59       +         10.81       0          9.78*     0.65 
 6. Arm rigidity              14.71       +         13.51       ?          1.20*     0.04 
 7. Moving hands              26.47       +         27.03       +          0.56      0.04 
 8. Verbal inhibition         17.65       -          8.11       ?          9.54*     N.A. 
 9. Hallucination             41.18       +         27.03       +         14.15*     1.02 
10. Eye catalepsy             11.76       0         10.81       0          0.95*     N.A. 
11. Posthypnotic suggestion   41.18       +         43.24       -          2.06      0.01 
12. Amnesia                   11.76       -         18.92       -          7.16      0.25 


* Percentage discrepancy drug > placebo. 
† P < 0.05, d.f. = 1. 


Discussion 


Overall, the drug increased performance more than the placebo, but this is not the most 
important finding of the study. This relative increase is likely to be an artefact of the complex of 
factors of personality and sex associated with the experimental population. The main finding of 
the study is that the original hypothesis which initiated it has been confirmed. The tranquillizer 
has reduced the level of neuroticism in those high on neuroticism and hence the neurotic 
extraverts have become more hypnotizable and the neurotic introverts less, as predicted by 
Gibson & Curran (1974). This fact is apparent from Table 3 and the computations arising from it. 

The variable of sex has tended to complicate matters, particularly since, as demonstrated by 
Gibson & Corcoran (1975), females are over-represented in the neurotic introvert quadrant and 
males in the stable introvert quadrant. The fact that females taking the drug are influenced 
paradoxically according to their original level of susceptibility, as shown in Table 5, may prompt 
one to advance a Yerkes-Dodson explanation of the phenomenon. Those females originally 
rather low in susceptibility may have their tensions removed by the tranquillizer and hence their 
susceptibility is increased, but those females who already have a talent for hypnotic behaviour 
may feel themselves rather out of control when influenced by the drug and hence respond with 
less cooperation. That the paradox is most evident in those females low on neuroticism is 
difficult to explain, and too much speculation on the issue should not be made on the basis of 
these small numbers. 

The same sort of paradoxical response to the drug may account for the lowered susceptibility 
in the drug group on some of the items of the SHSS. The postural sway test is a case in point. 
The percentage discrepant for the drug group was 32.35 per cent and for the placebo group 
18.92 per cent, both groups declining in susceptibility. Many subjects reported that the drug 
made them feel ‘swimmy’ and it is understandable that in such a condition some may have 
feared that they would fall over and hence tensed up on the postural sway test. Hull (1933) 
showed that tensing of the musculature on this test resulted in a lowered scorable response. 

The one item in which there was a statistically significant difference between the drug and 
placebo groups in their test-retest performance was item 3, hand lowering. It may be significant 
that this is the easiest test involving as it does a simple ideomotor response to a suggestion that 
is being reinforced by the gravitational force. 
One aspect of this study which was not investigated was placebo reaction. 
Although Evans (1969) has implied that there is little identity between hypnotic susceptibility and 
placebo reaction, the subjects in this study received a peripherally active placebo, nicotinic 
acid, and many of them reported pronounced symptoms of flushing, tingling and increased 
heart rate, possibly subjectively enhanced. They therefore felt that they were ‘drugged’, and 
psychological consequences might possibly follow in the hypnotic situation. 

This study illustrates the pointlessness of asking the age-old question, ‘does a drug increase 
hypnotic susceptibility?’ Instead it gets us to the point of inquiring what is the specific action of 
a drug on subjective and behavioural manifestations of different people according to their 
varying personalities? Knowing what the changes are likely to be, we can then predict what 
differences are likely to occur in the different types of experience and behaviour which go to 
make up the overall phenomenon of the hypnotic response. 
Impairment of a motor skill in children with spina bifida cystica and 
hydrocephalus: An exploratory study 


Elizabeth M. Anderson and Ian Plewis 


Twenty 7-10 year old children with spina bifida cystica and hydrocephalus and 20 normals matched for age, 
sex and IQ were compared on a 12-trial target task, first used by Connolly, Brown & Bassett (1968). 
Analysis of the results, in which particular attention was paid to statistical method and to ways of analysing 
individual differences, showed a significant impairment in dotting speed in the spina bifida group, although 
both groups improved with practice. In a second experiment immediately following the first, visual 
monitoring of this task was restricted. The spina bifida children were initially more affected than the 
controls but able to recover. The findings are discussed in relation to neurological abnormalities in the spina 
bifida group. 


Motor (manual) skills in spina bifida children 


Spina bifida is now, next to cerebral palsy, responsible for the largest group of physically 
handicapped children in Britain. The condition results from a congenital malformation of the 
spinal cord and the backbone. The less common type, spina bifida meningocele, has less serious 
consequences, but in the much commoner spina bifida myelomeningocele, the imperfectly formed 
spinal cord and its coverings are exposed at birth with resultant lower-limb paralysis and 
incontinence. Over 80 per cent of children with spina bifida myelomeningocele also have 
hydrocephalus. This is a condition resulting from an obstruction of the free flow of 
cerebro-spinal fluid round the brain and is now generally controlled by the insertion of a 
pumping system into the child's head soon after birth. The great majority of these hydrocephalic 
children have an associated malformation of the cerebellum and other lower brain stem 
structures, the Arnold-Chiari malformation. The cerebellum suffers from degenerative changes 
particularly in its central lobes (Variend & Emery, 1974) and with the medulla, pons and fourth 
ventricle, is markedly elongated and displaced downwards into the upper cervical canal (Daniel 
& Strick, 1958; Peach, 1965). 

Recent years have seen a marked increase in research into the specific learning difficulties of 
children with spina bifida and hydrocephalus. However, with a few exceptions (e.g. Wallace, 
1973) the study of manual control in this group has been greatly neglected and it is still often 
assumed that upper limb function is normal. On theoretical grounds, there are three main 
reasons for supposing that hand function might be impaired. First, the cerebellum is known to be 
crucially involved in the control of movement, particularly, recent research suggests, in the 
preprogramming and initiating of rapid ballistic movements (Colloques C.N.R.S., 1974). The 
effects of damage to the cerebellum have been reviewed in detail by Dow (1969) but broadly 
speaking disturbances in the range, direction, force and rate of voluntary movement can be 
expected, these being loosely referred to as ‘cerebellar ataxia’ (Mackenzie, 1963). Secondly, 
damage to the motor cortex resulting from hydrocephalus can affect not only the lower but also 
the upper limbs which may show signs of pyramidal tract abnormalities (e.g. Bell & 
McCormick, 1972; Menkes, 1974). Thirdly, the greatly restricted mobility and the poor balance 
of young spina bifida children (one hand often being used as a prop), as well as long periods of 
hospitalization mean that these children have limited opportunities to practice manual skills. 

Until fairly recently (e.g. Connolly, 1970) comparatively little research has been carried out on 
motor skills even with normal children. One exception was the study carried out by Connolly et 
al. (1968), who investigated changes in the speed and accuracy of a simple motor response in 60 
boys and girls falling into three age-groups: 6 year olds, 8 year olds and 10 year olds. Using a 
target task of dotting between two circles, in which both speed and accuracy were measured, they 
found an increase in speed with age, the girls performing significantly faster than the boys in 
all three age-groups. No reliable age or sex differences were found, however, in the accuracy 
component. 

It was felt that this task would provide a useful measure of motor skill in spina bifida children 
and accordingly, as one of several experiments on hand function in spina bifida children, a 
replication of Connolly's study (Expt. I) was carried out with an experimental group of 20 7-10 
year old children (10 girls and 10 boys) with spina bifida myelomeningocele and hydrocephalus 
and a control group of 20 non-handicapped children matched individually for age, sex and IQ. 
The main purpose of the replication was to look at dotting speed (rather than accuracy where 
Connolly et al. found an enormous amount of individual variability), the main prediction being 
that significant differences in speed of dotting would be found between the two groups in favour 
of the non-handicapped children. 

Immediately after the replication of Connolly's 12-trial task had been carried out (Expt. I) a 
modified version of the task was introduced (Expt. II) in which the children's visual feedback 
was restricted, one of the circles into which the child was dotting being screened from his view. 
This was very much an exploratory experiment. Some unpublished work had been reported to one 
of us in which the performance of ‘clumsy’ children on a target task was found to differ 
significantly from that of normal children only when the amount of visual feedback available was 
reduced forcing them to place greater reliance on kinaesthetic feedback. The work of Ayres 
(1965) and others has shown that kinaesthetic ability is often impaired in brain-damaged children 
and it was expected that this might be true of the spina bifida group. It was thus predicted that a 
restriction of visual feedback would result in a decrement in the performance of the spina bifida 
children, as compared to performance under normal conditions, but not in that of the controls. 


Method 
Subjects 


In both experiments 20 children with spina bifida and hydrocephalus took part and 20 controls matched 
individually for sex, age (to within 6 months) and IQ on the Columbia Scale of Mental Maturity. The mean 
Columbia IQ of the spina bifida group was 90.3 (range 72-124) and their mean age 8 years 9 months (range 
7 years 2 months to 10 years 2 months) while the corresponding figures for the non-handicapped children 
were 90.2 (range 68-125) and 8 years 8 months (range 6 years 11 months to 10 years 0 months). In each 
group there were ten boys and ten girls, half of these being ‘younger’ children aged 7-8½ years and half 
‘older’ children aged 8½-10 years, giving altogether eight subgroups of 
five children each. All the spina bifida children attended special schools, and all the non-handicapped 
children ordinary schools except three who were in ESN schools. 


Materials and procedures 


These are described in detail in the Connolly et al. paper (1968) and since Expt. I was a replication of their 
study only a brief account will be given here. Two circles, each with a radius of 1 in and with centres (not 
marked) 5 in apart, were drawn on blank 8 × 5 in cards and the child, using his preferred hand, was 
required to dot with a pencil between one circle and another. The importance of both speed and accuracy 
(‘as near the middle as possible’) was stressed. The task was first demonstrated by the experimenter, then a 
practice trial was given. There were 12 trials proper, each of 5 sec, the stopwatch being started as the first 
dot was made. Knowledge of results (i.e. number of dots made) and encouragement were given after each 
trial. 

In Expt. II (restricted visual feedback), an adjustable curtain of opaque black material was suspended from 
an expandable curtain-wire hooked to two uprights which were clamped to the front of the child's table. The 
card onto which the child dotted was placed behind this curtain which was then adjusted so that only one of 
the circles was visible (left-hand one for right-handers, right-hand one for left-handers). The child placed his 
forearm under the curtain, and dotted from one circle to another as before, starting with the circle which was 
visible. The curtain material was so light that freedom of movement was in no way restricted. In order to 
maintain a high level of motivation only six trials were given but otherwise the procedure was as in Expt. I. 
Table 1. Expt. I. Cell means and standard deviations 


                 Spina bifida (n = 20)       Non-handicapped (n = 20) 
                 Boys    Girls   Both        Boys    Girls   Both        Total 
Younger  Mean    10.93   11.10   11.01       13.08   14.08   13.58       12.30 
         S.D.     2.13    4.51    3.33        1.91    2.45    2.14        3.10 
Older    Mean    11.53   10.41   10.97       14.78   16.46   15.62       13.30 
         S.D.     3.45    2.82    3.03        1.47    2.70    2.23        3.52 
Total    Mean    11.23   10.78   10.99       13.93   15.27   14.60       12.80 
         S.D.     2.72    3.56    3.10        1.85    2.74    2.37        3.28 








Table 2. Expt. I. Analysis of variance and least squares estimates of effects 


Source                  d.f.      F value    P         Estimate of effect 
Between subjects 
  Handicap              1, 32     16.21      < 0.01    SB - NH = -3.62 
  Sex                   1, 32      0.24      > 0.10    Boys - Girls = -0.44 
  Age                   1, 32      1.26      > 0.10    Younger - Older = -1.00 
Within subjects 
  Trials                11, 22     5.71      < 0.01 
    Linear trend*       1, 22     78.69      < 0.01    0.36 
  Handicap × trials     11, 22     0.89      > 0.10 
    Linear trend*       1, 22      3.77      < 0.07    SB 0.28, NH 0.44 
  Sex × trials          11, 22     0.70      > 0.10 
    Linear trend*       1, 22      1.88      > 0.10    Boys 0.30, Girls 0.42 
  Age × trials          11, 22     0.41      > 0.10 
    Linear trend*       1, 22      0.78      > 0.10    Younger 0.33, Older 0.39 


* Higher order terms generally had probability values > 0.10 (see text). N.B. All other interaction terms had 
probability values > 0.10. 


Scoring 


Each target card provided two scores, one for speed and one for accuracy. The score for speed was the total 
number of dots falling inside the two target circles. The accuracy score reflected the distance of each dot 
from the centre and was obtained as described by Connolly et al. (1968). However, the main focus in this 
paper will be on dotting speed rather than accuracy. 


Results 


An overall picture of differences between groups for Expt. I is given in Tables 1 and 2. 

1. The between-subjects analysis followed the usual pattern with estimates of the main effects 
of all factors shown. For the within-subjects analysis it is possible not only to calculate whether 
or not the effects are significantly greater than zero but also, by using orthogonal polynomials 
(see, for example, Kendall & Stuart, 1973), to estimate the linear, quadratic, cubic, etc., terms 
of the relationship between speed and time and thus obtain a clearer picture of the effect of 
practice or fatigue. 
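
The linear component of such a trend analysis can be illustrated with a minimal sketch. For equally spaced trials the linear orthogonal-polynomial coefficients are proportional to the centred trial numbers, so the contrast divided by the sum of squared coefficients is the least-squares slope per trial. The scores below are hypothetical, constructed only to show a slope of 0.36; they are not data from the study, and the function name is illustrative.

```python
def linear_trend(scores: list[float]) -> float:
    """Least-squares slope of score on trial number (gain per trial).

    Uses centred trial positions as linear contrast coefficients; they sum
    to zero, so the overall level of the scores does not affect the result.
    """
    k = len(scores)
    coeffs = [t - (k - 1) / 2 for t in range(k)]
    contrast = sum(c * y for c, y in zip(coeffs, scores))
    return contrast / sum(c * c for c in coeffs)


# Hypothetical scores over 12 trials, rising by exactly 0.36 dots per trial
hypothetical_scores = [10 + 0.36 * trial for trial in range(12)]

slope = linear_trend(hypothetical_scores)
gain_over_three_trials = 3 * slope

print(f"slope per trial: {slope:.2f}")
print(f"gain over three trials: {gain_over_three_trials:.2f}")
```

A slope of 0.36 per trial corresponds to a gain of roughly one dot every three trials. Quadratic and higher terms would need their own coefficient sets, which this sketch does not cover.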

2. The main results of this part of the analysis can be summarized as follows (Fig. 1). There is 
a marked handicap effect with spina bifida children dotting about 25 per cent more slowly than 
the non-handicapped, this effect explaining 31 per cent of the overall variability. All groups show 
a positive practice effect over the 12 trials (gaining about 1 point every three trials) but there is 
slight evidence that the spina bifida group improve more slowly. However, as shown in Fig. 1, 
the spina bifida children are, after 12 trials, doing as well as the controls were on the first trial. 
For both groups there is quite a marked increase in dotting speed between trials 9/10 and 11/12, 
probably because the children were told trials 11 and 12 were the last two trials and were 
determined to beat their previous records. 

Figure 1. Mean number of dots made by spina bifida and non-handicapped boys and girls over 12 trials. 
Vertical axis: mean number of dots per trial; horizontal axis: trial number (pairs 1-2 to 11-12). 
Separate curves for SB boys, SB girls, NH boys and NH girls. 
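
The between-subjects estimates can be approximated from the published cell means alone. The sketch below assumes a balanced design (five children per cell, as described under Subjects), so that each marginal mean is the unweighted average of four cell means; because it starts from rounded values, it only approximates the least squares estimates in Table 2. The dictionary layout and function names are illustrative.

```python
# Cell means from Table 1 (Expt. I), keyed by (group, sex, age band)
cell_means = {
    ("SB", "boys", "younger"): 10.93, ("SB", "girls", "younger"): 11.10,
    ("SB", "boys", "older"): 11.53, ("SB", "girls", "older"): 10.41,
    ("NH", "boys", "younger"): 13.08, ("NH", "girls", "younger"): 14.08,
    ("NH", "boys", "older"): 14.78, ("NH", "girls", "older"): 16.46,
}


def marginal_mean(factor_index: int, level: str) -> float:
    """Unweighted mean of all cells at one level of one factor."""
    values = [m for key, m in cell_means.items() if key[factor_index] == level]
    return sum(values) / len(values)


def main_effect(factor_index: int, level_a: str, level_b: str) -> float:
    """Difference between the marginal means of two levels of a factor."""
    return marginal_mean(factor_index, level_a) - marginal_mean(factor_index, level_b)


handicap_effect = main_effect(0, "SB", "NH")      # Table 2 reports -3.62
sex_effect = main_effect(1, "boys", "girls")      # Table 2 reports -0.44
age_effect = main_effect(2, "younger", "older")   # Table 2 reports -1.00

# Handicap effect as a proportion of the non-handicapped mean
relative_slowing = -handicap_effect / marginal_mean(0, "NH")

print(f"{handicap_effect:.2f} {sex_effect:.2f} {age_effect:.2f} {relative_slowing:.1%}")
```

The relative slowing computed this way is close to the figure of about 25 per cent given in the text.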

3. The interpretation of the analysis of variance is, however, subject to two theoretical 
constraints. First, the matching procedure nullified the assumption of independent samples 
within groups and thus gave 'conservative' F ratios for the between-subjects analysis. The effect 
of matching is indicated by the correlations across factor levels; these were all low and further 
analysis showed that in fact the matching had little influence on the power of this experiment. 

Secondly, does the inclusion of a further polynomial component in the trend analysis add a 
statistically significant amount to the explained sum of squares? Step-down F tests are useful for 
this purpose (Roy & Bargmann, 1958) but are biased if the error correlation matrix for the 
transformed variables is not ‘uniform’ (Geisser, 1963). This uniformity is equivalent to constant 
or no correlation between scores on each and every trial. It did not exist for these data although 
the departure from constant correlation across trials was not great. Since the overall F values for 
the trials effect and its interactions are multivariate F ratios they do not suffer from this biasing 
problem. However, it would seem prudent to disregard the few higher order trend terms with 
probability levels less than 0.1. The F ratios for the linear trends are similarly biased; this is 
unlikely to affect the overall trend but the handicap × trials trend should be treated with some 
caution. 
Table 3. Expt. I. Distribution of mean dotting scores (mean no. of dots per trial over 12 trials) 


                        Mean number of dots per trial 

                     5.0-9.9    10.0-14.9   15.0-19.9     20+ 
Group                n    %      n    %      n    %      n    %     Range 
SB group (n = 20)    9   45      9   45      2   10      0    0     5.00-16.00 
NH group (n = 20)    0    0     12   60      7   35      1    5     10.25-21.15 


Table 4. Expt. I. Distribution of linear trends 

                                      Linear trend 
                           <0   0-0.25   0.25-0.5   0.5-1.0   Mean   S.D.   Range 
Spina bifida (n = 20)       2     8         7          3      0.28   0.26   -0.14 to 1.00 
Non-handicapped (n = 20)    1     5         4         10      0.44   0.26   -0.08 to 0.89 








Table 5. Expt. I. Correlational structure of individual differences in linear trend 

                          Spina bifida                        Non-handicapped 
                  Linear   Score,                      Linear   Score, 
                  trend    Trial 1   IQ      Age       trend    Trial 1   IQ      Age 
Linear trend       1.00                                 1.00 
Score, Trial 1     0.02     1.00                       -0.03     1.00 
IQ                 0.58    -0.01    1.00               -0.11     0.32    1.00 
Age                0.19    -0.10   -0.24    1.00        0.37     0.00   -0.32    1.00 








An examination of the residuals provided confirmatory evidence of the plausibility of the linear 
model for the subjects as a whole. Indeed it explained 91 and 96 per cent of the between-trials 
variation for the spina bifida and non-handicapped groups respectively. 

4. Table 3 shows the nature of individual, as opposed to group, differences in level of dotting 
speed. It shows clearly that almost half of the spina bifida children did worse than 
any of the control group whereas only 10 per cent of the spina bifida children did as well as the 
upper 40 per cent of the controls. 

It is also possible to calculate linear trends (i.e. practice effects) for each subject, the method 
being identical to that used for groups with a regression equation fitted to individual rather than 
to mean scores on each trial. Table 4 gives the results and shows a high level of variability 
within both groups with, as for level of dotting speed, the spina bifida group more variable. 
However, it is interesting to note that the overlap is considerably greater. Clearly, the category 
‘spina bifida’ is not particularly useful in predicting the practice effect for this task even though 
the group differences appear to be quite substantial. 
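
The per-child calculation described above can be sketched as follows. This is a minimal illustration that assumes ordinary least squares of score on trial number; the function name `linear_trend` and the example scores are hypothetical and are not data from the study.

```python
def linear_trend(scores):
    """Fit score = a + b * trial by least squares.

    Returns (slope, explained), where slope is the practice effect per trial
    and explained is the proportion of the between-trials sum of squares
    accounted for by the fitted line.
    """
    n = len(scores)
    trials = range(1, n + 1)
    mean_t = sum(trials) / n
    mean_s = sum(scores) / n
    sxy = sum((t - mean_t) * (s - mean_s) for t, s in zip(trials, scores))
    sxx = sum((t - mean_t) ** 2 for t in trials)
    syy = sum((s - mean_s) ** 2 for s in scores)
    slope = sxy / sxx
    explained = (sxy ** 2) / (sxx * syy) if syy > 0 else 0.0
    return slope, explained


# Hypothetical child whose score rises by exactly half a dot per trial.
example_scores = [10.0 + 0.5 * t for t in range(1, 13)]
slope, explained = linear_trend(example_scores)
print(slope, explained)
```

Applied to each child's 12 trial scores, the slope corresponds to the individual linear trend tabulated in Table 4, and the explained proportion to the individual percentages reported for the residual analysis.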

To what extent can this within group variability be explained? Is the practice effect associated 
with IQ, age or initial speed? Relevant correlations are presented in Table 5 and although the 
results cannot be regarded as conclusive because the sample was small there is some evidence 
Table 6. Expt. II. Cell means and standard deviations 

                      Spina bifida               Non-handicapped 
                  Boys    Girls   Both       Boys    Girls   Both      Total 
Younger   Mean    11.36   11.49   11.43      15.66   15.50   15.58     13.50 
          S.D.     3.37    6.00    4.59       1.78    3.30    2.50      4.18 
Older     Mean    12.36   10.63   11.50      18.66   18.33   18.50     15.00 
          S.D.     3.06    5.07    4.05       3.61    3.93    3.56      5.16 
Total     Mean    11.86   11.06   11.46      17.16   16.91   17.04     14.25 
          S.D.     3.08    5.25    4.22       3.11    3.73    3.35      4.70 














Table 7. 

                    Mean 
                    linear trend   S.D.   Range 
Spina bifida           0.42        0.55   -0.80 to 1.49 
Non-handicapped        0.29        0.63   -0.97 to 1.54 








that IQ influences the rate of improvement in the spina bifida children while age is more 
important for the non-handicapped group with the older children improving faster than the 
younger ones. 

Although the linear model appears reasonable for the groups as a whole, is it satisfactory for 
individual children? Residuals for each child were examined and although a few showed unusual 
patterns, there was no systematic divergence such as one might expect if, for example, fatigue 
had set in. Not surprisingly, however, individual variation about the fitted regression line is 
greater than the group variation. The mean explained sum of squares is 37 per cent for the spina 
bifida children and 47 per cent for the controls compared to figures of 91 and 96 per cent for the 
groups as a whole. 

5. The two within-subjects factors, conditions and trials, were confounded and this aspect of 
the design inevitably weakens the results obtained for the six trials involving restricted visual 
feedback. In particular, the prediction for this experiment which was equivalent to an expected 
interaction between ‘conditions’ (i.e. complete visual feedback as opposed to restricted visual 
feedback) and ‘handicap’ could not be satisfactorily included in an analysis of variance. 

The cell means and S.D.s are given in Table 6 and the analysis of variance gave similar results 
to those obtained under the ‘normal’ condition with a significant handicap effect (P < 0.01) and, 
within subjects, a significant overall linear trend (P < 0.01). This time, however, the values for 
the linear trend show a different pattern. The spina bifida children have a mean improvement 
rate of 0.42 against 0.29 for the non-handicapped group and when all the younger children are 
considered as one group their mean improvement rate is 0.47 while the corresponding figure for 
the older children is 0.24. Although neither of these differences is statistically significant, it is 
true that the introduction of the new condition upset the spina bifida children more and for both 
groups, the younger children were more affected. The mean drops were 3.8 and 1.9 for younger 
and older spina bifidas and 1.0 and -0.2 (i.e. a gain) for younger and older controls. 

6. When individual differences are considered, it can be seen that the non-handicapped group 
shows a higher level of variability (Table 7) than the spina bifida children for this condition. 
Table 8. Expt. II. Correlational structure of individual differences in linear trend 

                             Spina bifida                          Non-handicapped 
                     Linear   Trial 12-                     Linear   Trial 12- 
                     trend    Trial 13*   IQ      Age       trend    Trial 13*   IQ      Age 
Linear trend          1.00                                   1.00 
Trial 12-Trial 13     0.71     1.00                          0.68     1.00 
IQ                    0.25     0.06      1.00               -0.31    -0.28      1.00 
Age                  -0.31    -0.27     -0.24    1.00       -0.09    -0.08     -0.32    1.00 

* First trial of Expt. II. 


For both groups the greater the initial disturbance the higher the linear trend and this holds 
when age is partialled out (Table 8). The influence of IQ and age on rate of improvement does 
not appear to hold for this condition. However, again individual residuals are reasonably well 
behaved. 
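
As a rough check on the statement that the relationship holds when age is partialled out, the first-order partial correlation can be computed from the coefficients in Table 8. The sketch below uses the standard partial correlation formula; the function name `partial_corr` is hypothetical, and the input values are the Table 8 coefficients as read from the printed table.

```python
from math import sqrt


def partial_corr(r_xy, r_xz, r_yz):
    """Correlation between x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# x = linear trend, y = drop from Trial 12 to Trial 13, z = age.
spina_bifida = partial_corr(0.71, -0.31, -0.27)
non_handicapped = partial_corr(0.68, -0.09, -0.08)
print(round(spina_bifida, 2), round(non_handicapped, 2))
```

Under these assumptions both partial correlations come out at about 0.68, close to the zero-order values, which is consistent with the text.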


Discussion 


Since Expt. I was closely based on the Connolly et al. study (1968) the first question asked was 
the extent to which results obtained for the non-handicapped group replicated their findings. 
Connolly et al. obtained mean dotting scores of 15.73 and 17.73 for their 8 year old boys and 
girls: the corresponding figures for the non-handicapped group (on average, 10 months older) 
were slightly lower, 13.93 and 15.27. 

Connolly and his colleagues also obtained significant age effects in favour of the older children 
and sex effects in favour of the girls. The absence of an age effect here was probably because 
the age difference between each of the three groups in Connolly's study was two years 
compared to only one year here and secondly because of the large amount of variability within 
groups, particularly for the spina bifida children. The non-handicapped children do show a trend 
in the expected direction (mean for younger group 13.58 and for older group 15.62), whereas this 
was not true of the spina bifida children. Similar comments can be made about the absence of a 
sex effect. The difference between the mean scores of the non-handicapped boys and girls (1.34 
points in favour of the girls) is very similar to that obtained by Connolly (1.8 points); again this 
did not hold for the spina bifida group where the variability was also greater. Thus overall the 
non-handicapped children performed much as expected on the basis of Connolly's findings. 

Although accuracy was not of major interest it is worth recording that the results here were 
very similar to those of Connolly. Three main points can be noted. First, the mean accuracy 
scores for the non-handicapped children did not differ significantly from those obtained by 
Connolly. Those of the spina bifida children were almost identical with those of the 
non-handicapped group, their slower speed not being compensated for by greater accuracy. In 
both groups, as in the Connolly study, the girls were slightly but not significantly more accurate 
than the boys. Secondly, as in the Connolly study, the increase in speed over trials was 
paralleled, in both groups, by a deterioration in accuracy. Thirdly, again as Connolly found, the 
amount of variability within each group was striking. 

In all studies of motor skill it is essential to ensure that the subjects are highly motivated, 
particularly handicapped children, who may have been shielded from competition and are often 
satisfied with a low level of performance which does not reflect their true potential. Anxiety in 
new situations and concentration problems are also common. In this study all the children had 
already met the experimenter for testing sessions on three occasions and were keen to come 
again. They enjoyed competing against themselves and motivation was high throughout. Because 
each trial lasted only 5 sec a high level of concentration was maintained. 

The most striking finding in Expt. I was, as predicted, the marked impairment of motor speed 
in the spina bifida group. With a mean age of 8 years 9 months they were, on average, dotting 
slightly more slowly than the 6 year olds in the Connolly study. Since an inverse relationship has 
been reported between speed and accuracy it is conceivable that an over-concern with accuracy 
slowed them down but this seems highly unlikely. Observing the children, it was quite clear that 
in both groups the total number of dots made rather than their nearness to the centre was 
all-important. 

It is suggested that several factors contribute to the poor motor performance of this group. 
The first and most striking is impaired effector (manual) control. Neurological testing carried out 
by Wallace (1973) indicated that at least 80 per cent of hydrocephalic spina bifida children suffer 
from upper limb dysfunction, with ataxia, tremor and muscle weakness all being common. The 
predominant disorders are cerebellar ataxia (a highly likely consequence of the Arnold-Chiari 
malformation) or pyramidal tract dysfunction or a mixture of both. Secondly, dotting between 
two circles requires more than simply effector control: the task also makes demands upon 
visuo-perceptual and visuo-motor skills. The results of psychometric and experimental work 
carried out by one of us (Anderson) and by others (e.g. Miller & Sethi, 1971; Spain, 1974; 
Dodds, 1975) indicate that many of these children also have central processing deficits affecting 
two main aspects of visuo-spatial performance, sensory organization and motor organization 
processes. Thirdly, this task is clearly greatly affected by practice and undoubtedly most spina 
bifida children will come to it having had far less opportunity than the controls to develop their 
motor skills through appropriate experiences. 

Because the trial periods were so short systematic observations of the quality of performance 
during testing could not be made but it did appear that some of the spina bifida children grasped 
the pencil in an awkward and immature way and too tightly. They also had greater problems in 
controlling the forcefulness of their movements, occasionally punching holes in the card or 
breaking the pencil point. Some found it difficult to control the movement of the pencil point on 
the page: it would slide on the surface so that / rather than . was produced, this also being true 
of some of the younger non-handicapped children. Another problem was one of technique, with 
more of the spina bifida children than the controls tending to treat the task more as a series of 
discrete movements rather than as a continuous whole. Connolly's findings suggested that 
improvements in technique were age-related: in this respect the performance of the spina bifida 
children seemed more typical of the younger normal children. Finally, it was noticeable that the 
spina bifida children often got off to a slower start than the controls: they gave the appearance 
of taking longer to process the instruction ‘go’. 

The findings concerning rate of improvement were also interesting. Although the spina bifida 
children tended to improve more slowly than the controls, almost all of them did make significant 
improvements over the 12 trials: conversely there was no evidence of fatigue or of a decrement 
in attention, although ‘poor concentration’ and ‘tires easily’ are remarks frequently applied to 
spina bifida children. The findings here thus clearly suggest that within the constraints which 
neurological impairment may impose, spina bifida children can, like non-handicapped children, 
improve their motor performance with practice. For neither group was it possible to say how 
close the children were to reaching a plateau in level of performance. One other point needs to 
be noted here, this being the relationship between intelligence and motor skill. It tended to be 
the more able spina bifida children who showed the greatest rate of improvement over the 
12 trials (correlation of 0.58 between intelligence and improvement rate) whereas these factors 
were unrelated in the non-handicapped group. 

Next a few comments must be made about the findings in Expt. II. This, it will be 
remembered, was carried out to see whether the spina bifida children would show a greater 
decrement in performance than the controls when forced to rely at least partly upon kinaesthetic 
rather than visual feedback. Contrary to predictions the spina bifida children were not, as a 
group, unable to cope with the restriction on visual feedback. Although their immediate reaction 
to the introduction of this condition was, in most cases, a marked drop in dotting speed (a 
reaction similar to that of the younger but not the older non-handicapped children) this was only 
a temporary reaction from which the majority were able to recover, with those children (in both 
groups) who had been most thrown out by the new condition showing the largest practice 
effects. 

This finding suggests that kinaesthetic ability is not markedly impaired in spina bifida children 
as a group, although it is of course possible that they were using some other form of feedback 
such as ‘central monitoring of efference’, a form of feedback which has been discussed in some 
detail in recent papers (e.g. Angel, Garland & Fischler, 1971; Cohn, Jaknunias & Taub, 1972; 
Jones, 1974). Only one spina bifida and one control group child failed completely to reach their 
former level of performance under the new condition. Both were characterized by low IQs (the 
non-handicapped child attended an ESN school), marked attention problems and a high degree 
of impulsivity. 

It was noted by Connolly et al. (1968) and in this study that some children treated the task 
(under normal conditions) as a series of discrete movements, looking from one circle to another 
as they moved the pencil, while others treated it as a continuous whole, concentrating mainly on 
watching one circle. In the restricted visual feedback condition all the children were forced, as it 
were, to adopt the latter technique at least as far as eye movements were concerned. Although 
we have no evidence of this it seems likely that the children who continued to improve 
steadily when Expt. II was introduced, i.e. the older non-handicapped children, were those who 
were already (voluntarily) depending more on kinaesthetic rather than on visual feedback, 
whereas those who had previously checked the position of each dot visually took a little time 
(during which performance deteriorated markedly) to adapt to the new technique which was 
forced upon them. 

It has been shown that the majority of children with spina bifida and hydrocephalus showed 
marked impairment in speed on a target task. They were, however, able to improve considerably 
with practice although there was a great deal of individual variation related, to some degree, to 
intelligence. The task used, with its demands on motor control, eye-hand coordination and 
judgement of distances is typical of many manual tasks, and the findings on other tasks of motor 
skill on which these same children were tested (including a hand function test, a tracing test, 
and a variety of copying and writing tests), were very similar. Thus, although the sample was 
admittedly small, and although considerable variation existed within the spina bifida group, we 
feel able to predict with some confidence that well over half (and probably a much higher 
proportion) of children with spina bifida and hydrocephalus are likely to show a considerable 
degree of impairment in performance on a wide variety of motor tasks. There were also, 
however, clear indications that most of these children will respond well to practice and since few 
systematic attempts have so far been made to give the children such practice, especially in the 
first two years of life when the cerebellum in particular is developing rapidly, it is not possible 
to predict where the limits to improvement lie. 
The role of semantic information in short-term memory in children aged 5 
to 9 years 


N. E. Wetherick and Jane Alexander 





A previous finding that adults recalled more words if the words were drawn from one semantic category than 
if they were drawn from more than one was confirmed for children down to age five. The adult tendency to 
store words in order of presentation (resulting in a substantial primacy effect) was however largely absent in 
5-6 year olds and present only in a weaker form in 7-9 year olds. The recency effect affecting the last word 
in a list was as strong in 5 year olds as in adults but the rate of decline in recall backwards through the serial 
positions was much sharper both in 5-6 year olds and in 7-9 year olds than it was in adults. 





Wetherick (1975) employed a technique for the study of memory in which lists of eight 
single-syllable words drawn from one, two, four or eight semantic categories were presented for 
immediate verbal recall. His (adult) subjects showed a capacity to recall more words where the 
words were drawn from fewer semantic categories, the best recall being obtained where all the 
words came from one category. They also showed a strong inclination to recall the words in the 
order of presentation (whether or not instructed to do so) which accounted for the primacy 
effect observed, and a recency effect favouring the last word in the list which did not, however, 
depend on any tendency to recall the last word first. He interpreted these results as evidence in 
support of Craik's 'alternative framework' for human memory research (Craik & Lockhart, 
1972; Craik, 1973) suggesting that words may pass directly to corresponding memory traces and 
that the possibility of retrieval may be affected by pre-existing structural relationships between 
the traces which may be close (if the words come from one semantic category) or distant (if they 
come from many). He further suggested that sequential tagging may, in Craik's terms, represent 
a memory function involving ‘continued attention to one aspect of the situation’ (i.e. ordinal 
position in the list). 

It is clearly of interest to establish how far the same phenomena may be observed in the 
performance of young children since, as Jablonski (1974) points out, there is no generally 
accepted theoretical account of the development of recall. The present study attempted to apply 
the technique to children aged from five to nine years. The technique requires that there should 
be traces in the long-term store corresponding to the words employed. It was therefore 
necessary to prepare lists of words known to children of that age and meeting the other demands 
of the task. Only four of the original semantic categories survived intact (parts of the body, 
articles of clothing, foodstuffs, numbers) and two with modifications (colours, names). Five new 
categories were adopted; classroom objects (board, chalk, desk, etc.), kinds of movement (jump, 
walk, climb, etc.), toys (gun, swing, doll, etc.), domestic objects (spoon, dish, plate, etc.) and 
room furnishings (chair, rug, shelf, etc.). Lists of four, six and eight words were prepared but 
children aged five and over were found to be capable of coping with six-word lists. Thirty 
children aged between five and nine were tested on six-word lists but of these seven could not 
cope with the eight-word lists subsequently presented. 


Methods 


Twelve six-word and 12 eight-word lists were constructed, drawn from one, two, three or six categories 
(six-word lists) or from one, two, four or eight categories (eight-word lists). These will be referred to 
respectively as 6-1, 3-2, 2-3 and 1-6 lists or as 8-1, 4-2, 2-4 and 1-8 lists; ‘6-1’ meaning six words from one 
semantic category. Each subject repeated back three lists of each type, six-word and eight-word lists on 
successive days. It was not found possible to get the children to attend to a pre-recorded list; the lists were 
therefore read by the second author individually to each child at as nearly as possible a uniform rate of 60 
word/min. 


Instructions 


‘I am going to read you some lists of words. Once I have finished reading a list I would like you to say back 
to me all the words you can remember in any order. Do you understand what you have to do...? Don't 
worry if you can't remember some words, I am not testing you.’ 


Subjects 

Twenty-three children from Edinburgh and Aberdeen primary schools, selected on the teachers' advice as 
‘typical’ in their class. Six were aged 5, five aged 6, four aged 7, two aged 8 and six aged 9. Results will be 
presented for two groups: 5 and 6 year olds (n = 11; 6F, 5M) and 7, 8 and 9 year olds (n = 12; 6F, 6M). 


Results 


Three scores were calculated for each list: S (order), the number of words recalled in order of 
presentation; S (total), the number of words recalled irrespective of order; and S (semantic), the 
number of words recalled in groups related in meaning. S (order) was calculated by scoring 2, 3, 
4 or more for every sequence of two, three, four or more words recalled in order of presentation. 
S (semantic) was calculated by scoring 2, 3, 4 or more for every sequence of two, three, four or 
more words recalled from the same semantic category. S (semantic) can only be calculated for 3-2, 2-3, 4-2 and 2-4 
lists since in the remaining types of list words come either from one semantic category (in which 
case they are all related in meaning) or from six or eight different categories (in which case none 
of them are related). 
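
The three scores defined above can be illustrated with a short sketch. Several details here are assumptions rather than the authors' stated procedure: a sequence is taken to be a run of adjacent recalled words, "in order of presentation" is taken to mean that each word was presented later than the word recalled before it, and intrusions and repeated words are dropped before scoring. All names and the example list are hypothetical.

```python
def score_runs(items, linked):
    """Sum the lengths of maximal runs (length >= 2) of adjacent items
    for which linked(previous, current) is true."""
    total, run = 0, 1
    for prev, cur in zip(items, items[1:]):
        if linked(prev, cur):
            run += 1
        else:
            if run >= 2:
                total += run
            run = 1
    if run >= 2:
        total += run
    return total


def score_recall(presented, categories, recalled):
    """Return (order score, total score, semantic score) for one list."""
    position = {word: i for i, word in enumerate(presented)}
    # Assumption: keep only list words, first occurrence only.
    seen, valid = set(), []
    for word in recalled:
        if word in position and word not in seen:
            seen.add(word)
            valid.append(word)
    s_total = len(valid)
    s_order = score_runs(valid, lambda a, b: position[b] > position[a])
    s_semantic = score_runs(valid, lambda a, b: categories[a] == categories[b])
    return s_order, s_total, s_semantic


# Hypothetical 3-2 list: three body parts and three colours.
presented = ["hand", "foot", "red", "blue", "arm", "green"]
categories = {"hand": "body", "foot": "body", "arm": "body",
              "red": "colour", "blue": "colour", "green": "colour"}
recalled = ["hand", "foot", "arm", "dog", "green", "red"]
print(score_recall(presented, categories, recalled))
```

In this example the intrusion "dog" is ignored, five list words are recalled, the first four form a single run in presentation order, and the recall falls into one run of three body parts and one run of two colours.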

Table 1a, b and c shows the mean scores obtained. In every case the older children obtained 
higher mean scores than the younger. In brackets for comparison are the mean scores of the 
adult group that worked under comparable conditions (i.e. FR instructions, 60 word/min) taken 
from Wetherick (1975). 

Table 1a shows S (order) scores. In the 5-6 year old group there were no significant differences 
between scores on different types of list or between scores on six-word and eight-word lists. In 
the 7-9 year old group there were still no significant differences between types of list but scores 
were significantly higher on six-word lists than on eight-word lists (sign test, P < 0.01, 
two-tailed). The S (order) scores of this group on six-word lists were comparable with adult 
S (order) scores on eight-word lists. 
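
The comparisons in this section rely on the two-tailed sign test. A minimal sketch of the exact test is given below, assuming ties are discarded before counting; the function name and the example counts are hypothetical and are not taken from the study.

```python
from math import comb


def sign_test_two_tailed(n_plus, n_minus):
    """Exact two-tailed sign test probability under the null hypothesis
    that positive and negative differences are equally likely.
    Tied pairs are assumed to have been dropped already."""
    n = n_plus + n_minus
    k = min(n_plus, n_minus)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical case: 11 of 12 children score higher on six-word lists.
p = sign_test_two_tailed(11, 1)
print(p)
```

With 11 differences in one direction and 1 in the other, the two-tailed probability is 26/4096, which is below 0.01.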

Table 1b shows S (total) scores. In the 5-6 year old group, scores were significantly higher on 8-1 
lists than on 6-1 lists (sign test, P < 0.025, two-tailed), but there were no significant differences 
between scores on eight-word and six-word lists of the remaining types. There were no 
significant differences between different types of six-word list but scores on 8-1 lists were 
significantly higher than on the remaining types of eight-word list (sign test, P < 0.01, two-tailed). 
In the 7-9 year old group scores were again significantly higher on 8-1 than on 6-1 lists but were 
significantly lower on the remaining types of eight-word list than on the corresponding six-word 
lists; scores on 8-1 lists were again significantly higher than on the remaining types of 
eight-word list (sign test, P < 0.001, two-tailed). The superior recall of words drawn from one 
semantic category, characteristic of adults, is already present in both groups of children on 
eight-word lists and, in the younger group, on six-word lists. Where more than one semantic 
category is involved, however, recall scores are lower on eight-word than on six-word lists in the 
older children and no higher in the younger children. This result appears to follow from the fact 
that, regardless of the semantic structure of the list, the older children achieved adult S (order) 
scores on six-word lists but not on eight-word lists. Thus the semantic advantage of 6-1 lists 
over the remaining types of six-word list was cancelled out but not the corresponding advantage 
Table 1. Mean number of words recalled out of 6 or 8. (a) S (order) - correct in correct order; 
(b) S (total) - correct in any order; (c) S (semantic) - correct in semantic category 

                          (a) S (order) score            (b) S (total) score                  (c) S (semantic) score 
Six-word lists 
Type of list              6-1     3-2     2-3     1-6     6-1     3-2     2-3     1-6           3-2     2-3 
5-6 year olds (n = 11)    1.36    1.06    1.09    1.39    3.27    2.75    2.97    2.85          0.69    0.69 
7-9 year olds (n = 12)    2.33    2.53    2.52    2.25    3.97    4.36    4.00    3.61          1.61    0.72 

Eight-word lists 
Type of list              8-1     4-2     2-4     1-8     8-1     4-2     2-4     1-8           4-2     2-4 
5-6 year olds (n = 11)    1.73    0.76    0.69    1.12    4.21    2.92    2.8     [illegible]   1.46    0.36 
7-9 year olds (n = 12)    2.05    0.97    1.03    1.22    5.04    3.49    3.33    3.31          1.8     0.64 
Adults                   (2.96)  (2.56)  (2.67)  (2.62)  (6.03)  (5.69)  (5.2)   (4.90)        (3.09)  (1.04) 






of 8-1 lists, and the remaining types of six-word list received a substantial advantage over the 
corresponding types of eight-word list. 

Table 1c shows S (semantic) scores. S (semantic) scores on lists involving three or four semantic 
categories were very low (as they are in adults). In lists involving two categories, the 5-6 year olds scored more 
highly on eight-word lists than on six-word lists (sign test, P < 0.05, two-tailed). 

Figure 1 shows serial position curves for the two groups of children and the comparable adult 
group. The curves are plotted on a percentage correct basis since the group n's differed. There 
are no data for adults on six-word lists and the curve shown is that for eight-word lists with 
serial positions 4 and 5 omitted; this does no serious violence to the eight-word data since scores 
at positions 4 and 5 were asymptotic. The (artificial) curve for adults and the curve for 7-9 year 
old children are virtually coincident at all serial positions except 3. The curve for 5-6 year old 
children is coincident with the adult curve only at positions 5 and 6. There appears to be a 
recency effect independent of age which is affecting the end of the list but the decline backwards 
through positions 5, 4 and 3 is steeper in younger subjects. In adults and 7-9 year olds, there is 
also a strong primacy effect. 
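
The percentage-correct basis used for the serial position curves can be sketched as follows. This is an illustrative calculation only: the function name and the toy lists are hypothetical, and recall is counted irrespective of output order.

```python
def percent_correct_by_position(trials):
    """trials is a list of (presented, recalled) pairs, all with lists of
    the same length. Returns, for each serial position, the percentage of
    trials on which the word presented at that position was recalled."""
    length = len(trials[0][0])
    hits = [0] * length
    for presented, recalled in trials:
        recalled_set = set(recalled)
        for i, word in enumerate(presented):
            if word in recalled_set:
                hits[i] += 1
    return [100.0 * h / len(trials) for h in hits]


# Two hypothetical three-word trials.
trials = [(["a", "b", "c"], ["c", "a"]),
          (["d", "e", "f"], ["f"])]
print(percent_correct_by_position(trials))
```

Expressing each position as a percentage of trials makes groups of different sizes directly comparable, which is the reason given for plotting the curves this way.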

The curves for eight-word lists show coincidence at serial position 8 only, but on eight-word 
lists the performance of 7-9 year olds resembles that of 5-6 year olds more than it resembles that 
of adults. The decline backwards through positions 7, 6, 5, 4 and 3 is almost identical in the two 
groups of children and much sharper than in the adult group. The 7-9 year olds do, however, 
show a substantial primacy effect at serial positions 1 and 2 which is absent in the 5-6 year olds. 

Wetherick (1975) pointed out that the recency effect on the last item in the lists he employed 
was not entirely a consequence of the subject recalling the last item first. His subjects tended, in 
fact, to recall the last item last if at all, particularly at the fast speed. A similar analysis of the 
results obtained in the present experiment is shown in Table 2. There was no difference between 
age groups in this measure; figures for the comparable adult group are shown in brackets. On 
eight-word lists the children’s performance was almost exactly comparable with that of adults. 
Comparing their performance on six- and eight-word lists however it appears that recall of the 
[Figure 1: two panels, ‘All instructions six-word lists’ (serial positions 1-6) and ‘All instructions eight-word lists’ (serial positions 1-8).] 


Figure 1. Serial position effects in six- and eight-word lists (taken from Wetherick, 1975). Children (5-6 
years) A—A; children (7-9 years) @—@; Adults 0—0. N.B. There are no adult data on six-word lists. The 
curve shown is the eight-word curve with serial positions 4 and 5 omitted. 


last word ‘last but two or more’ (i.e. near the beginning of the recalled list) was a reaction to 
overload — since it is less common on six-word lists and recall ‘last’ correspondingly more 
common. The last word was in fact recalled first on 29.2 per cent of eight-word lists but on only 
21.6 per cent of six-word lists. 

Intrusions and repetitions were frequent, particularly in the younger children, and a word that 
was repeated once might then be repeated over and over again. Murdock (1974, p. 221) has 
pointed out that ‘the retrieval mechanism of an individual is not designed to sample without 
replacement’, and that this may be particularly noticeable in children. 

Discussion 

These findings tally with previous findings in the area. Cole, Frankel & Sharp (1971) showed that 
the conditions of our experiment are among the most difficult for children (since we presented 
words rather than objects or pictures, in unblocked rather than blocked categories and required 
verbal rather than written recall). They found increased recall with age as we did and, in 
addition, found that the factors affecting recall did not interact with age, implying uniformity of 
development. Thurm & Glanzer (1971) found as we did that older children's recall was no better 
than that of younger children at the end of the list, their superiority appearing at the beginning 
and in the middle of the list. Hagen & Kingsley (1968) showed that labelling (which presumably 
affects semantic characteristics) improved recall of words at the end of the list only. 

The recency effect on the last word of the list appears from these results to be constant from 
age five to maturity. Craik (1968) has shown that it remains constant through the normal 
life-span (his subjects were aged from 22 to 65). There is in all subjects a decline in recall of 
words relatively more remote from the last serial position which gets progressively sharper in 
younger children and in older adults (Craik, 1968). There is, however, also a tendency to store 
words in sequential order (whether or not the subject has been instructed to do so) which 
favours early words in the list. This tendency is virtually absent in 5-6 year old children; it is 
present at the adult level in 7-9 year old children on six-word lists but is much weaker in these 
children on eight-word lists. In 7-9 year olds the last words in the longer list seem partially to 
erase the information on which the primacy effect depends. Something similar occurs in adults 
Table 2. Number of occasions on which the last word in the list was either not recalled, recalled 
last or recalled in some earlier position 





Last word in the list 





Not Recalled Recalled last Recalled last 
recalled but one but two or more 
Six-word lists (children 72 41 40 
aged 5-9, n = 23) 
Eight-word lists (children 62 42 84 
aged 5-9, n = 23) 
Eight-word lists (61) (53) (84) 
(adults, n = 23) 








[see e.g. Murdock (1962) where presentation of lists of 10, 15, 20, 30 and 40 words always showed 
a primacy effect but of progressively smaller absolute size]. 

The superiority of lists drawn from one semantic category over lists drawn from more than 
one is already apparent in children aged five. Presenting eight words rather than six from one 
category results in a significantly higher S, (total) score with an S, (order) component that is not 
significantly smaller. Where, however, more than one category is involved eight-word lists result 
in significantly lower S scores than six-word lists, with substantially lower S, components; so 
that the facility with which sequential tags may be attached to words seems also to depend on 
how closely related the words are semantically. 

Sequential contrast effects with retarded subjects after discrimination 
learning with and without errors 


Jean-Luc Lambert 





Four retarded adults were exposed to a multiple variable interval-extinction schedule of reinforcement. The 
2 min components were presented in a random order. Two subjects acquired the discrimination with an 
errorless procedure. The two other subjects learned the discrimination with errors. The schedule generated 
sequential contrast effects under both training procedures: response rates during periods of reinforcement 
were higher when a reinforcement period followed an extinction period than when it followed another 
reinforcement period. These results were confirmed in a second observation where eight retarded children 
were exposed to a multiple fixed ratio-extinction schedule of reinforcement. This study is the first 
demonstration of sequential contrast effects with human subjects during the acquisition of a discrimination 
without errors. The results are discussed in terms of Terrace’s theory of errorless learning. 





The schedule effect known as ‘behavioural contrast’ has been repeatedly demonstrated in 
pigeons (Reynolds, 1961; Terrace, 1963 a; Bloomfield, 1967). This effect is observed when the 
subject is successively presented with two stimuli. The rate of responding during one stimulus is 
altered by changing the schedule of reinforcement associated with the other stimulus. The 
change in response rate during one stimulus in a direction away from the response rate during 
the other stimulus is called behavioural contrast (Reynolds, 1961). The effect is commonly 
observed on two-component multiple schedules, where two stimuli are alternately presented, 
each correlated with independent schedules of reinforcement. To produce the contrast, the 
responses are first reinforced in both components according to identical schedules of 
reinforcement. After rate stabilization, reinforcement is no longer delivered in one component 
(S—), while the schedule in the other component (S+) remains the same. This change in 
procedure results in an increased rate of responding during S+. 

An extensive study of this contrast effect has been made by Terrace (1966) by programming a 
multiple schedule in which S+ and S— were presented in a random sequence rather than in 
alternation. S+ components were programmed to occur after both S— and other S+ 
components. Terrace (1966) observed that response rates during S+ components that followed 
S— components were greater than response rates during S+ components that followed other S+ 
components. This phenomenon was called ‘sequential contrast effects’. In human subjects, 
behavioural contrast has been reported in normal children (Nicholson & Gray, 1971), while 
sequential contrast effects have been observed in retarded subjects (O'Brien, 1968). 

Most observations of behavioural contrast have been made in situations of discrimination 
training with errors, i.e. a procedure that allows many unreinforced responses (errors) to occur 
in the presence of S-. When the subjects are trained by an errorless procedure, no behavioural 
contrast is recorded (Terrace, 1963 a, b). This difference in performance following discrimination 
learning with and without errors has been recently reported in normal adults (Terrace, 1974). 

Errorless discrimination training has been accomplished with developmentally retarded 
subjects (Sidman & Stoddard, 1967; Touchette, 1968; Lambert, 1974). However, most of 
Terrace's experiments on the differences between discrimination learning with and without errors 
have never been replicated with retarded subjects. In order to extend the findings from the 
laboratory to the design of educational programmes, one must have a better knowledge of the 
mechanisms underlying the errorless procedures. 

The present experiments were designed to study sequential contrast effects with retarded 
subjects during discrimination learning with and without errors. 

Experiment I 

Method 

Subjects. Four institutionalized retarded adults served as subjects. Subjects GB (CA = 30:3 yr; IQ = 34) and 
RO (CA = 25:1 yr; IQ = 30) had previous training in the experimental room in single and multiple schedules 
of reinforcement. Subjects RK (CA = 34:9 yr; IQ = 47) and EP (CA = 43:10 yr; IQ = 46) were experimentally 
naive. 


Apparatus. An experimental room (6 ft by 6 ft) with a periscope for behaviour observation (Asano & 
Barrett, 1964) was used. A wooden panel situated in front of the subject contained a Lindsley manipulandum 
and a stimulus-presenting device mounted above the manipulandum. Four inches to the right of the plunger 
was a reinforcement tray. The subject's responses were reinforced with pennies. Pennies earned were used 
to buy consumables after each session. During a reinforcement cycle, the room and the stimulus light were 
shut off and the tray was illuminated for 3 sec. Responses during this magazine cycle were not reinforced. 
White noise was continuously present in the room. Standard programming and recording equipment was 
housed in an adjacent room. 


Procedure. The subjects were presented with a Mult (VI 20 sec-EXT) schedule. During the VI component, 
the stimulus-presenting device was illuminated by a white light (S+) and pennies were delivered for 
responses on the average of 1 per 20 sec. During the extinction component, the light was off (S—) and no 
reinforcement was programmed. The components each lasted 2 min and were presented in a random 
sequence with one restriction: there were no more than two successive presentations of the same stimulus. 
Consecutive components were separated by a period of 5 sec during which the light in the room was turned 
off. Responses during this interval had no programmed effects. Session duration was 30 min. Two different 
procedures of discrimination training were used. 

(a) Errorless training (subjects GB and RK). The method used was similar to Terrace's 'early progressive 
introduction’ of S— (Terrace, 1963 a). Discrimination training started during the first experimental session. 
During the first two sessions, the duration of S+ was 2 min. The duration of S— was progressively increased 
according to the following schedule: 1, 3, 5, 7, 10, 15, 20, 30, 40, 60, 80, 100 and 120 sec. At the end of the 
second session, S+ and S— randomly alternated and both components lasted 2 min. Each subject received 
ten sessions. 

(b) Trial and error training (subjects RO and EP). Discrimination training started during the first 
experimental session. However, the duration of S— was initially at its maximum value, 2 min. S+ and S— 
alternated in a random sequence with a duration of 2 min for each component. Each subject received ten 
sessions. 
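
The random ordering of components described in the Procedure, with its restriction of no more than two successive presentations of the same stimulus, can be illustrated with a short sketch. This is not the programming equipment used in the study; the function name, the labels and the number of components per session are assumptions made for illustration only.

```python
import random


def component_sequence(n_components, max_run=2, rng=None):
    """Return a random list of "S+" / "S-" labels with no run longer than max_run."""
    rng = rng or random.Random()
    sequence = []
    for _ in range(n_components):
        choices = ["S+", "S-"]
        recent = sequence[-max_run:]
        # If the last max_run components were all the same stimulus,
        # the next component must be the other one.
        if len(recent) == max_run and len(set(recent)) == 1:
            choices.remove(recent[0])
        sequence.append(rng.choice(choices))
    return sequence


# n_components=14 is an illustrative value, not a figure reported in the text.
print(component_sequence(14, rng=random.Random(0)))
```

Drawing each component in turn and forcing a change after two identical components is only one way of satisfying the restriction; it does not necessarily sample all admissible sequences with equal probability.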


Results 


Figure 1 shows the response rate per session emitted by the four subjects during presentations 
of S+ and S— components. The two subjects of the early progressive group made virtually no 
responses to S— during the ten sessions. Their results are in contrast with those of the subjects 
of the trial and error group. For subject RO, the response rate during S— averaged 40 per cent 
below the response rate during S+ for the first seven sessions. From session 8, the rate of errors 
decreased to reach a zero level at session 10. For subject EP, the error rate averaged 70 per cent 
below the rate during S+ for sessions 1 to 5. From session 6, subject EP presented a stable 1 
per cent rate until the end of the experiment. For subjects RO and EP, the difference between the 
rates during S+ and S— did not appear to vary in any specific pattern. 

Figure 2 shows the evolution of sequential contrast effects for each subject. Sequential 
contrast effects are defined as greater response rate during S+ following an S— component than 
during S+ following another S+ component. Faced with the high intra-individual rate variability, 
an analysis of the contrast effects was conducted in per cent difference between relative rates 
rather than in difference between absolute rates. The data presented in Fig. 2 are per cent 
differences between response rates during S+ components that followed S— components (S+/S—) 
and response rates during S+ components that followed other S+ components (S+/S+). The per 
cent difference was calculated by subtracting the rate for S+/S+ for each session from the rate 
for S+/S— and dividing the difference by the rate for S+/S+ (O'Brien, 1968). This proportion 
was then multiplied by 100. A positive difference indicates the occurrence of sequential contrast 
effects. 

Figure 1. Responses per minute during S+ (●) and S— (○) for each subject under both training procedures. 

Figure 2. Per cent difference between response rates during S+ components that followed S— components 
and response rates during S+ components that followed other S+ components for each subject under both 
training procedures. 
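
The per cent difference measure defined in the text can be written out as a small function. The sketch below follows the stated arithmetic only; the function and parameter names are hypothetical, and the example rates are invented for illustration and are not data from the experiment.

```python
def percent_difference(rate_after_extinction, rate_after_reinforcement):
    """Per cent difference between S+ response rates.

    rate_after_extinction:    rate in S+ components that followed the
                              extinction component
    rate_after_reinforcement: rate in S+ components that followed another
                              S+ component
    """
    if rate_after_reinforcement == 0:
        raise ValueError("the rate after another S+ component must be non-zero")
    difference = rate_after_extinction - rate_after_reinforcement
    return 100 * difference / rate_after_reinforcement


# Invented rates (responses per minute) for one session.
value = percent_difference(rate_after_extinction=60.0, rate_after_reinforcement=50.0)
print(value)  # a positive value indicates a sequential contrast effect
```

Dividing by the rate after another S+ component expresses the effect relative to each subject's own baseline, which is why the measure tolerates large differences in absolute rate between subjects.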

Figure 2 shows that sequential contrast effects were obtained under both training procedures. 
All the subjects presented a positive per cent difference in rates during the early sessions. For 
subject GB, who acquired the discrimination without errors, the magnitude of the effects was 
greatest during the first four sessions. By session 7, the per cent difference in rates was below 0. 
The evolution was similar for subject RK with the exception that the magnitude of the effects 
was lower. For the two subjects of the trial and error group, the sequential contrast effects were 
present during the first three sessions. The effects disappeared during three sessions for subject 
RO and during two sessions for subject EP. In both cases, a positive per cent difference in rates 
was present again during the last sessions. 


Discussion 


During the experiment, the two subjects trained with the errorless procedure acquired the 
discrimination with virtually no responses to S—. This offered further evidence that retarded 
subjects can be taught a discrimination without errors. For the subjects of the trial and error 
group, the data show evidence of stimulus control over rate of responding from the beginning of 
the experiment. For all sessions, response rate during S+ was higher than response rate during 
S—. However, a low rate of responding during the presentation of the extinction component was 
only observed in the last sessions. The results demonstrated sequential contrast effects in all 
subjects during the acquisition of the discrimination with and without errors. The only difference 
between the two procedures was that contrast effects seemed to be less transient in the 
trial and error group. For the subjects who acquired the discrimination under a procedure of trial 
and error, the presence of sequential contrast effects is not surprising. As a matter of fact, such 
effects have been found in other experiments, with pigeons (Reynolds, 1961; Terrace, 1963 a, 
among others), normal children (Waite & Osborne, 1972) and retarded adolescents (O'Brien, 
1968). As the results are at variance with theoretical expectation (Terrace, 1966), one might argue 
that the effects which we obtain would not be achievable with an extended number of subjects. 
Experiment II covers more data. 


Experiment II 


Experiment II used a different apparatus. The procedure was first designed to study the effects 
of extinction on the performances of subjects trained with and without errors. The results of 
extinction will be published elsewhere (Lambert, 1975). The data below are unpublished and 
concern the acquisition of the discrimination. 


Method 


Subjects. Eight retarded boys, non-institutionalized, served. Ages ranged from 7:3 to 13:8 yr. Their IQ 
scores, measured by the WISC, varied between 45 and 53. They were divided into two groups matched by IQ 
and chronological age. 


Apparatus. The experimental room (15 ft by 10 ft) contained a stimulus-presenting device consisting of a 
square Plexiglas panel (3 in) mounted in an aluminium plate. The panel, or key, was hinged at the lower edge 
and in contact with two microswitches at its top. The stimulus projection apparatus was mounted behind the 
panel. A shutter interrupted light from the projector during the interval between trials. Each slide contained 
the stimulus for one trial. Below the panel was a photocell keyed by a hole in the lower portion of the slide. 
This photocell served to decode S+. Light falling on the photocell was not visible to the subject. Every 
correct response was reinforced by a token delivered by a Gerbrands Token Dispenser. Tokens were visible 
to the subject. At the end of the session earned tokens were exchanged for candies or small toys. 

Table 1. Number of responses to S+ and to S- for each subject 

                               No. responses
Procedure        Subject no.   to S+    to S-
Errorless        1             841      0
                 2             902      0
                 3             760      0
                 4             1070     0
Trial and error  5             1066     89
                 6             1379     180
                 7             939      146
                 8             1111     195

Procedure. The subjects were presented with a Mult (FR 20-EXT) schedule. During the FR component, the 
panel was illuminated by a triangle with apex straight up (S+) and responses were reinforced on a fixed ratio 
schedule. During the extinction component, the panel was illuminated by a triangle with apex straight down 
(S—) and no reinforcement was programmed. Each component lasted 1 min and components were presented 
in a random sequence with a restriction: there were no more than two successive presentations of the same 
stimulus. In order to minimize the probability of reinforcement of responding to S— by the subsequent 
presentation of S+, responses during S— delayed the onset of the next S+ component until 10 sec without a 
response to S— had occurred. Two different procedures of discrimination training were used. 

(a) Errorless training (subjects 1, 2, 3, 4). S+ was immediately introduced at full intensity of 1 min 
duration. S— was changed from barely discernible dots of 1 sec duration to lines identical with S+ of 1 min 
duration. The intertrial interval was 20 sec. Each subject received 40 trials. 

(b) Trial and error training (subjects 5, 6, 7, 8). S+ and S— were immediately introduced at full intensity 
at their maximum values: 1 min. The intertrial interval was 20 sec. Each subject received 40 trials. 
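
The delay contingency described in the Procedure (responses during the extinction component postpone the next S+ until 10 sec have passed without a response) can be sketched as follows. This is one possible reading of the rule, not the programme actually used: the function name is hypothetical, response times are assumed to be measured in seconds from the onset of the extinction component, and the component is assumed to be followed by S+.

```python
def extinction_end_time(response_times, nominal_duration=60.0, delay=10.0):
    """Time (sec from component onset) at which the extinction component ends.

    The component lasts at least nominal_duration and is extended so that
    the final `delay` seconds before S+ onset contain no response.
    """
    end = nominal_duration
    for t in sorted(response_times):
        if t >= end:
            break  # the component had already ended before this response
        # A response at time t means S+ cannot begin before t + delay.
        end = max(end, t + delay)
    return end


print(extinction_end_time([]))                 # no responses: nominal duration
print(extinction_end_time([5.0, 55.0, 62.0]))  # late responses extend the component
```

Under this reading, responses early in the component have no effect on its length, whereas a response in the last 10 sec, or during an extension, pushes back the onset of S+.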


Results 


Table 1 shows the number of responses. Subjects of the programmed group acquired the 
discrimination without errors. Their performances are in contrast with those of subjects of the 
trial and error group. 

The analysis of sequential contrast effects for each subject was conducted on the same model 
as in Expt. I. Figure 3 shows that seven subjects presented a positive per cent difference in rates 
during discrimination training: three subjects of the errorless group and four of the trial and 
error group. 

These results are an extension of data obtained in Expt. I. The occurrence of sequential 
contrast effects in three subjects who learned the discrimination without errors is consistent 
with Rilling & Caplan’s (1975) observation in pigeons. These authors obtained behavioural 
contrast for four birds out of eight though the discrimination was acquired without errors. 


Discussion 

The occurrence of sequential contrast effects in subjects who received discrimination training 
under an errorless procedure is the first demonstration of this phenomenon in retarded subjects. 
Such results would not be predicted from those reported by Terrace who observed no sequential 
contrast effects in errorless learning with pigeons (Terrace, 1966, 1973) and normal adults 
(Terrace, 1974). Our findings must be discussed in terms of Terrace's theory of errorless learning 
and in light of recent results obtained in errorless training with pigeons. 

Figure 3. Per cent difference between response rates for each subject under both training procedures. 

In a trial procedure the stimuli acquire some functions as by-products of the method of 
training. These by-products include, among others, peak shift, responses to an escape key 
that removes S—, extinction-induced aggression and behavioural contrast. In two extensive 
reviews of these by-products, Terrace (1966, 1972) argued that they are obtained only after 
discrimination training with errors, and that they disappear with extended training. The basis of 
Terrace's explanation for the occurrence of these by-products is that they are a manifestation of 
emotional responses generated by the aversive properties of S—. The aversiveness of S— is 
produced by the non-reinforced responses emitted during the presentation of the stimulus. The 
responses to S— are thus responsible for the occurrence of by-products. However, recent 
experiments in pigeons have thrown doubt on this theory by demonstrating the occurrence of 
by-products in errorless conditions. Rilling, Richards & Kramer (1973) demonstrated that 
non-reinforced responses to S— are not the crucial factor contributing to the aversiveness of S—; 
the negative stimulus can acquire aversive properties following training without errors. In further 
studies, Rilling & Caplan (1973, 1975) demonstrated that aggressive responses and behavioural 
contrast could occur in pigeons during an errorless discrimination. Moreover, Karpicke & Hearst 
(1975) demonstrated the occurrence of an incremental generalization gradient around S— in 
pigeons errorlessly trained. This gradient indicates that S— acquires an inhibitory control during 
discrimination training. Their results are in contrast with those of Terrace (1966) who argued that 
a flat gradient was obtained around S- in errorless learning. The results of the present studies 
extend Rilling & Caplan’s observations and question Terrace’s theory about the role of errors. 
Since all the retarded subjects of the errorless groups presented virtually no responses to S-, 
one can hardly follow Terrace's explanation in terms of emotional responses generated by 
errors. Our results suggest that other factors are involved in the formation of by-products. A 
basis for further experiments is given by Rilling & Caplan (1975) who demonstrated that the 
procedure of introducing S— and the schedule of reinforcement during S+ are more important 
than responses to S— in the development of by-products of discrimination learning. 

As we have previously noted, the application of errorless procedures with developmentally 
retarded subjects requires a careful study of the mechanisms underlying these training methods. 
Further studies are needed to determine if sequential contrast effects during discrimination 
training without responses to S— are only one exception to Terrace's theory, or whether other 
by-products may also be obtained under conditions of errorless learning. 

The influence of language training on seriation of 5-6 year old children 
initially at different levels of descriptive competence 


Margery Heber 


Among explanations of the relation of language to cognitive development, those of Bruner, Olver & Greenfield (1966) 
and Piaget (1970) provide a contrast. Bruner has held language to be the major instrument of external 
influences moulding thought, while Piaget views it as merely one medium which represents developing 
thought systems. The present work is based on the language training experiment in seriation of 
Sinclair-de-Zwart (1967). Two groups of children comparable in seriation but who were initially at different 
levels of competence in the appropriate use of the relevant descriptions e.g. bigger/smaller, were trained in 
this use and then compared for subsequent progress in seriation, immediately and two weeks later. 
Significant progress followed for experimental subjects which was notably more rapid among those with 
prior command of the descriptions (middle class children). This advance was eventually equalled by a lower 
working class group. These results taken together appear to suggest an interactive process uniting 
speech-in-context with cognition. 





The present study began by considering the controversy between Bruner et al. (1966) and Piaget 
(Inhelder & Piaget, 1964) followed up by Sinclair-de-Zwart (1967) concerning the role of language 
in the cognitive development of children between the ages of 5 and 7. Bruner et al.’s account 
(1966) suggested that once the child has mastered speech this then becomes the major instrument of 
environmental and social influences shaping thought. Piaget states an apparently opposing view. 
He emphasizes that the origin of logico-mathematical ideas resides not in language but in the 
child's earliest practical activities. Each successive system of actions represents an increasing 
complexity of relations which gradually coalesce into logico-mathematical groups (Piaget, 1962, 
1970; Inhelder & Piaget, 1964). In his recent writings Piaget appears almost to dispense with 
external social influences and the speech which mediates them in favour of the child's own 
cognitive constructions. Although such a cursory account of these theories must do them an 
injustice, this problem has traditionally been viewed from two opposite extremes of a possible 
continuum. On the one hand speech has been regarded as the mould of developing thought, on 
the other hand thinking has been considered to be otherwise constructed, in which case speech 
was only its vehicle. The impact of this dichotomy can be seen in education (see, for example, 
Bereiter & Englemann, 1966; Furth, 1966). It seems likely that neither model fits the case. Not 
only do they overlook the changing complexities of cognitive and linguistic development but also 
the interactions between them which must similarly change over time and in different contexts. 
Thus warned against overgeneralizing it may seem sensible as an initial strategy to deal with the 
problem at a specific juncture in cognitive and linguistic development. We should choose a 
period when speech is well advanced and apply it to a task sufficiently difficult to extend the 
child beyond the range of his immediate visual imagery. If we also consider differences between 
children whose home backgrounds are variously favourable to the development of objective 
analytical speech usage and who may in consequence be at different levels of cognitive growth 
then we shall perhaps gain further insight into the relationship we wish to explore. 

Piaget has relied for experimental support for his view almost exclusively upon the work of 
Sinclair-de-Zwart (1967) although admittedly he sounds a note of warning when he says that her 
experiments are merely a beginning in the study of this complicated relationship (Piaget, 1971, 
p. 96). To date it seems this work has not been repeated. Sinclair-de-Zwart was attempting to 
compare levels of thinking with what she called ‘their linguistic subsystems’ (Sinclair-de-Zwart, 
1967, p. 11) and to counter Bruner's (1964) claim as to the primacy of specific language at the 
transition to operational thinking. She found that subjects who were pre-operational in 
conservation of continuous quantity and in seriation, according to Piaget’s scheme, were in each 
case limited to uncoordinated descriptions (e.g. ‘this crayon is big’, and of the same object, 
‘this crayon is thin’). When describing size differences they gave global descriptions (e.g. ‘big’ 
or ‘little’). Utterances which unite compensating dimensions when conserving quantity or which 
reversibly compare sizes (e.g. for the former, ‘this crayon is tall but thin’, or the latter, 
‘bigger/smaller’) were the prerogative of children who understood the underlying logic of 
conserving quantity or of uniting asymmetrical size relations. In tasks of comprehension all 
children were successful. She bases her contention that language cannot mould thought 
principally on her attempt to train pre-operational subjects in the expressions typically used at an 
operational level. In two experiments involving conservation of continuous quantity and seriation 
she claims to have found negligible progress in the development of logical thinking consequent 
upon such training. For her seriation training experiment Sinclair-de-Zwart selected 23 Genevan 
pupils at ‘pre-operational’ and ‘intermediate’ levels of seriation according to Piaget’s scheme 
outlined here in Fig. 1. After three sessions of training in the appropriate use of the relevant 
descriptions (e.g. ‘bigger/smaller’) they were re-tested for progress in seriation the next day and 
two weeks later. The subjects, we are told, were all average or above average in intelligence and 
between 5 and 6 years of age. Presumably they were at a comparable linguistic level initially. 
There was no control for effects of training. Testing procedures involved eliciting the description 
of a series, testing for comprehension, and then giving the seriation tasks. As to the first, the 
child was asked to describe a set of rods each of a different length assembled in order of size by 
the experimenter, first as to the whole configuration and then the relations of individual elements 
in ascending and descending order. Comprehension was tested by response to instructions in this 
context. From Sinclair-de-Zwart’s account of the seriation tasks we glean that the child was 
asked to construct such a series himself from a jumbled pile of rods and also either to interpolate 
a second series of rods of intermediate lengths or to construct a series correctly which was 
screened from view by selecting each element in order of size. It appears that the criterion of 
‘operational success’ was efficiency in either of the last two tasks. Post-tests replicated the 
pre-test. The account of Sinclair-de-Zwart’s language training is more thorough. It is adhered 
to with only minor additions in the present work. She stressed training in the description of the 
middle item of three (‘the description of 3 elements’), namely where A > B > C, the child 
learned to describe B as both < A and > C. However the child's ability to make such a description 
was not tested by her either before or after training. Her outline of parallel linguistic and logical 
categories is set out in Fig. 1. Although she notes a slight tendency for the former to precede the 
latter in development, she regards this as an ‘automatic’ manifestation. She places special 
emphasis upon the fact that learning difficulties were probably logical rather than linguistic. These 
were learning to substitute bigger/smaller appropriately and, above all, learning to describe B in 
respect of A and C - ‘the middle of 3 elements’. Only 12 subjects mastered this last description. 
As regards progress in seriation, according to Sinclair-de-Zwart, 18 of the 23 subjects made 
some logical advance, seven of whom reached an operational level of seriation but only three 
of these made substantial progress across substages. Such progress is thought by her to be of 
little consequence. 

That Sinclair-de-Zwart had set up a test of counter-views at a specific and sufficiently 
advanced cognitive and linguistic juncture in development provided some of the necessary 
ingredients for studying the problem in hand, but her methods lacked precision. This left the 
question open for further investigation. Points of uncertainty in her study which appear to be of 
particular importance are: lack of controls for effects of training, or of comparisons of these 
effects upon subjects at different initial levels of descriptive competence; insufficient detail about 
seriation procedures and coding, with consequent ambiguity as to behavioural criteria for 
allocation to substages and as to measures of progress; the omission from the tests of what may 
be the most critical description - ‘the description of 3 elements’ - thus precluding comparison of 
this with the final transition to an operational seriation level; and doubt as to whether language 
training was strongly didactic or open-ended in quality. Finally, no results for the second 
post-test were given on the grounds that they did not differ from those of the first. Some 
differences must have occurred if detailed protocols are to be believed, and since no statistical 
tests are made at any point, what constitutes a ‘significant difference’ remains a matter of 
opinion. 

The present experiment therefore attempts to repeat the language training experiment of 
Sinclair-de-Zwart in seriation and in addition to compare two groups of children one of which 
(lower working class, LWC) may possibly be relatively less efficient than the other (middle class, 
MC) in the appropriate use of the specific comparative expressions 'this one is bigger/smaller 
than that one'. According to Bernstein (1973) the LWC child is likely to be confined to a 
restricted code of language use whose function is to preserve the social structure, whereas MC 
children have also at their command wider uses of language, in particular, one which functions 
to analyse the world of objects and their relationships through independent thought. This kind of 
difference was found by Robinson & Rackstraw (1972) in answers of mothers to typical 
questions of five year olds and in answers given by seven year old children. The present author 
(Heber, 1974) also found similar social class differences in the questions asked by seven year old 
boys. These findings were consistent with Bernstein's idea that LWC children are not orientated 
to the use of language as a tool for discovery and analysis of cause and effect. Thus it is 
probable that they are at a similar disadvantage in the use of expressions required to describe a 
logical structure such as a series of asymmetrically arranged elements. Children who have this 
disadvantage may perhaps be expected to advance in this logic as a consequence of specific 
language training according to the degree of importance of such language at this point in the 
development of the particular logical structure. Thus according to two theoretical extremes, the 
one assuming that language is a major influence in the development of logical thinking, the other 
that it is irrelevant to this, we would expect the LWC group either to make significant progress 
in seriation or not, whereas the MC group would not be affected in either case. Probably, as 
oversimplifications, neither prediction is likely to be entirely correct. In particular, we should 
note that the first explanation assumes 'language' to have a function independent of context, a 
feature which is clearly not characteristic of the descriptions in question here. However, it is 
hoped that in this experimental situation differential patterns of progress in logical thinking 
following training in language will throw some light on the nature of the relation in question. 


Specific analysis of the problem 


According to Piaget, operational seriation is the logical level of understanding of the organization 
of elements in a series arranged in asymmetrical transitive relationship along some dimension. 
The dimensions in the present experiment are size and length. In the Piaget/Sinclair-de-Zwart 
analysis this appears to require an understanding by the child of both ascending and descending 
relations which characterize this kind of arrangement, and also, that each item occupies a unique 
position within it as both bigger than those below and smaller than those above it. For this level 
of understanding Piaget and Sinclair-de-Zwart, as has been noted, believe that either selection by 
size or interpolation of extra elements are sufficient alternative criteria. The present author, 
following Woodward (1974) has adopted a single criterion. This requires the insertion of an extra 
item in a covered series by systematic comparison with items on both sides of the point of 
insertion. To give a definitive description of this set of relations one must first select the relevant 
attribute (viz. size), next the type of relation (viz. difference) and then the specific nature of this 
difference (viz. that it is asymmetrical throughout). The comparative terms bigger/smaller must 
be used interchangeably as required in order to indicate the general characteristic of the series 
(viz. A > B > C, therefore C < B < A). Finally these terms must satisfactorily describe the 
necessary simultaneous relations of each element to those on either side of it (viz. where 
A > B > C, B is both > C and < A: ‘the middle of 3 elements’). Clearly then the appropriate 
and definitive description of this configuration out of context is impossible. Figure 1 shows 
Sinclair-de-Zwart's analysis of descriptions according to the developmental sequence she accords 
them placed opposite levels of understanding of seriation in Piaget's scheme to which she claims 
they are developmentally parallel (Fig. 1, columns 1 and 2). In addition are listed the criteria for 
a definitive description appropriate to seriation levels as discussed in this paper (Fig. 1, columns 3 
and 4). 


Sinclair-de-Zwart's analysis 

Column 1. Descriptions given by the subject. 
I. ‘dichotomy’, e.g. ‘big’, ‘little’. 
II. ‘trichotomy’, e.g. ‘big’, ‘middle-sized’, ‘little’. 
III. ‘labelling’, e.g. ‘mummy’, ‘daddy’, ‘baby’, etc. 
IV. ‘one-way’ use of comparative terms, e.g. ‘bigger’ or ‘smaller’. 
V. Interchangeable and appropriate use of ‘bigger/smaller’. 
(‘description of 3 elements’ only used in training by Sinclair-de-Zwart). 

Column 2. Seriation: levels of understanding ordinal size relations, based on Piaget. 
I(a) No success in seriation (pre-operational). 
I(b) Small uncoordinated series (pre-operational). 
II. Success through trial and error (intermediate). 
III. Competent size ordering and correct interpolation of intermediate length items or selection 
by size (operational). 

Present analysis 

Column 3. Descriptions: comparable levels analysed for both E and S. 
(a) Relevant attribute, e.g. ‘size’. 
(b) General relation, e.g. ‘difference’. 
(c) Descriptive use of terms, e.g. ‘big’, ‘little’; ‘big’, ‘middle-sized’, ‘little’; ‘mummy’, ‘daddy’, ‘baby’. 
(d) ‘one-way’ use of comparative terms, e.g. ‘bigger’ or ‘smaller’, i.e. terms not used 
interchangeably and appropriately. 
(e) Interchangeable and appropriate use of comparatives, e.g. ‘bigger/smaller’. 
(‘description of 3 elements’ tested and separately analysed in present study). 

Column 4. Seriation: in 6 tasks, behaviour featuring degrees of competence in ordering items by 
size: type of construction and amount of self-correction; type of selection; sequence of 
comparisons for interpolation. 
(a) Random selections, placement and comparisons. No corrections. 
(b) Slight evidence of grouping by size, e.g. big/little, also spot comparisons. 
(c) Rough selection by size; laborious but correct assembly of items in size order. For insertion, 
some sequential comparisons. 
(d) Effective achievement by trial and error. Some sequential comparisons and occasional 
comparisons on both sides of insertion. 
(e) Competent arrangement and selection in order of size and insertion of extra item into 
covered series by systematic comparisons with elements on both sides of point of 
insertion: ‘the double comparison strategy’. 

Figure 1. Seriation and the related description: Comparison of Sinclair-de-Zwart’s and present analysis. 

Method 

Design 

Two groups of children were selected and matched on tasks of seriation as being at ‘preoperational’ and 
‘intermediate’ levels. In each group there were ten experimental subjects and ten control subjects making 40 
in all. The independent variable was social class. Experimental children in each group were given specific 
language training in the use of comparative terms as typically employed by operational seriators and all 
subjects were tested subsequently in seriation tasks on two occasions, the first within a day or two of 
training and the second two weeks later. Tasks of description and comprehension were included in the 
pre-test and post-tests. 


Subjects 


The subjects were children of semi- and unskilled workers (designated lower working class - LWC) drawn 
from a school on a Council Estate, and children whose parents were professional or clerical (designated 
middle class - MC) from a school drawing from a mainly middle class residential area. Both schools were in 
the city of Southampton. Four of the MC group (2 experimental and 2 control) came from a second school 
of mixed social class intake. Despite differences in catchment areas all the schools employed similar 
child-centred methods. Selection for interview was in alphabetical sequence selected by social class, 
commencing with six year olds and lowering the age as necessary. Operational seriators were dropped while 
the remaining ‘pre-operational’ and ‘intermediate’ stage children were alternately allocated to experimental 
and control groups. The mean ages (years:months) arrived at in this way were: MC Es mean = 5:7; Cs mean = 5:6; LWC Es 
mean = 6:1; Cs mean = 5:9. All subjects were boys and were without obvious sensory, intellectual or 
emotional handicaps. 


Materials 


I. For seriation tasks. These materials and tasks are an extension of Piaget's and are based on those of 
Woodward (1974). They differ from the original techniques in the number and variety of tasks, thus 
providing varied dimensions. The last three covered sets, where one extra item must be interposed in the 
existing series by means of single sequential comparisons with items in that series, are in lieu of Piaget's 
interpolation task and/or Sinclair-de-Zwart's ‘screen test’, although this last is included as task 3 here. 

1. Loose sets. Three sets of ten rods, made of wooden square dowel, each set of a different colour (a) 
yellow, (b) red, (c) blue, each differing in overall size dimensions and regular or irregular differences in 
length between items. Two boards upon which arrangements of rods could be made. 

2. Covered sets. Three grooved boards of natural colour plywood with raised equidistant divisions of matt 
black. Three sets of ten rods with similar types of difference between sets and items as above, the rods to 
run in the grooves. For each set, nine extra rods of contrasting colour to the first ten, and of intermediate 
lengths so that when inserted they completed an asymmetrical series in each case. Plastic laminated 
cardboard covers for each board. An extra board and set with five main rods and four insertions for 
demonstration. These sets were: (d) blue and yellow with yellow cover, (e) green and white with purple 
cover, (f) red and blue with blue cover. The extra demonstration set was white and red, the cover red. 
Colours were chosen to give pleasant variety but also helped in identification. 


II. Training materials. Two sets of ten pairs of model slippers (one set blue, one set green) backed with 
glass paper, of constant proportionate increase in length and breadth, from 2.5 to 16 cm in length. One board 
faced with beige brushed nylon 9 cm by 30 cm. 


Procedure 


All children selected for the experiment were interviewed individually on six occasions, each session lasting 
between 10 and 40 min approximately, depending on individual needs. These sessions comprised: (i) a 
pre-test for selection (operational seriators were dropped at this point); (ii) three ‘language’ training sessions; 
(iii) two post-tests, the first approximately one day after the last training session, the second two weeks later. 
Children in the control groups had similar treatment omitting the training sessions. Sessions were tape 
recorded except where seriation tasks were confined mainly to action, when written records were made by 
the experimenter. Children did not appear to be distracted by either type of recording. 
Pre-test 


(i) Description. The loose series set (a) was assembled by the experimenter and the child was asked for a 
description, first of the whole (e.g. ‘a staircase’) and next of the relations of individual elements in two 
directions (e.g. ‘bigger/smaller’) the order being varied. Three items were then isolated, A > B > C, and the 
child was asked to describe B in relation to A and C. This was termed ‘the description of three elements’. 


(ii) Seriation. The child was asked to perform six seriation tasks after he had been shown a completed 
series which had previously been constructed out of sight by the experimenter, using set (a). These were: 
task 1 arranging the jumbled elements of ‘loose’ set (b) ‘to make a staircase’; task 2 a similar procedure 
using set (c); task 3 selecting elements from jumbled set (a) ‘in the right order to make a staircase’ and 
handing them one by one to the experimenter for assembly behind a screen (Sinclair-de-Zwart's ‘screen test’); 
task 4 using the extra covered demonstration set, the experimenter illustrated the correct insertion of one extra 
rod, comparing it sequentially with rods one at each end of the covered items. The cover was then removed 
to show the correct position of the extra item in the series. Then, using set (d), the subject was asked to insert 
one extra item as in the demonstration procedure and encouraged to ‘look at’ as many of the rods from the 
existing series as he liked, but only one at a time, before he chose the point of insertion. The cover was then 
removed and the child encouraged to judge his degree of success and allowed extra trials if he desired. 

Tasks 5 and 6 were similar to task 4 using covered sets (e) and (f) but without demonstration. 


(iii) Comprehension. The child was asked to respond to a set of commands which required comprehension of 
the comparative terms ‘bigger/smaller’ using rods of different lengths. 

For descriptions and for seriation tasks, prompting was varied according to individual needs in order to 
ensure that the child understood what was required of him and was responding as far as possible at his 
optimum level. Children were always asked to judge their degree of success in each of the seriation tasks 
and encouraged to make any corrections they deemed necessary. 


Language training 

Following Sinclair-de-Zwart, practice was given in the appropriate use of the expressions 'this one is 
smaller/bigger than that one’. This description referred to a single set of slippers previously set out by the 
experimenter in size order. Each item in relation to the next was described in ascending and descending 
order. Then the experimenter added the pair to each slipper in the series and this correspondence was noted 
by the child. Finally the child was led to describe ‘the middle of 3 elements’, e.g. A > B > C, but before B alone 
was related to A and C, a middle pair B and B' were included and the child was asked to describe B in 
relation to A, and B' in relation to C. These exercises were repeated until the child had mastered the 
descriptions concerned. This level of competence was achieved at some point during the three training 
sessions by all subjects, care being taken to maintain interest throughout. As far as possible in this 
experiment descriptions were elicited by open questioning, a method which may allow the child to 
incorporate his own utterance appropriately and may thus be in keeping with Piaget's theory of 
‘self-regulation’. 


Post-tests. 
Both the immediate and the delayed post-tests replicated the pre-test. 


Analysis of data 

Seriation 

The types of behaviour chosen as representative of the child’s understanding of size order 
relations were: his strategies of selection and comparison of rods; whether corrections were 
made and whether these were spontaneous; the type of achievement and the child’s judgement 
of this. According to this framework behaviour was classified into five categories ranging from 
less to more competent for the first three and last three tasks (see Fig. 1). In tasks 1-3, this 
ranged from random selection and two-way grouping of rods, e.g. ‘big’, ‘little’, the child being 
apparently satisfied with this achievement, to correct spontaneous ordering by size. In tasks 4, 5 
and 6 the first category of behaviour consisted in arbitrary interpolation, the child making no 
comparisons despite prompting and being apparently unable to recognize errors. The most 
advanced category was that thought characteristic of an operational understanding of seriation 
namely, that the insertion rod should be systematically compared with those on either side of it 
before placement, this being the only logical means of ensuring success. This was termed ‘the 
double comparison’ strategy. 
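
The ‘double comparison’ strategy can be pictured as a small procedure. The sketch below is only an illustration and not part of the experimental materials: it assumes the covered series is held as a list of distinct rod lengths in ascending order, and the function names `fits_gap` and `find_gap` are hypothetical. A gap is accepted only when the rod on each side of it has been checked against the extra rod.

```python
def fits_gap(series, extra, gap):
    """Double comparison: check the rods on BOTH sides of a candidate gap.

    `gap` is the index at which the extra rod would be inserted, so its
    neighbours are series[gap - 1] (left) and series[gap] (right).
    """
    left_ok = gap == 0 or series[gap - 1] < extra
    right_ok = gap == len(series) or extra < series[gap]
    return left_ok and right_ok


def find_gap(series, extra):
    """Try each gap in turn and return the first one that passes both checks."""
    for gap in range(len(series) + 1):
        if fits_gap(series, extra, gap):
            return gap
    raise ValueError("no gap fits: lengths must be distinct and ascending")


# Hypothetical rod lengths, not the dimensions used in the experiment.
covered_series = [2, 4, 6, 8, 10]
position = find_gap(covered_series, 5)
```

A rod compared with only one neighbour could pass that single check at several gaps; requiring both checks leaves exactly one admissible position when all lengths are distinct, which is the sense in which the strategy is described as the only logical means of ensuring success.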


Descriptions 

These were judged as to whether they were contextually appropriate. They were categorized 
according to the specifications mentioned in Fig. 1, namely at five levels, from a general 
description of the configuration to a precise and appropriately reversible account of size 
relations. Differences from Sinclair-de-Zwart’s analysis can be seen from this figure; for 
instance, items from her middle categories were too sparse in the present data to justify 
separation here. It also seemed important to analyse and count the utterances of both the 
experimenter and the child since this dialogue might reflect differences in ‘language use’ 
between groups. Descriptive units were semantic and functional in the sense that each attempt to 
describe a particular relation in the series of rods, whether grammatically complete or not, was 
counted as one unit. 


Treatment of results 


Individual levels in seriation tasks and descriptions were scored on a five-point scale in each case 
(1-5, or a—e) and these levels were compared before and after training between experimental and 
control subjects within each group (MC and LWC) and also between these groups using the 
Mann-Whitney U test. Advance between sessions was assessed for all groups using the 
Wilcoxon matched-pairs signed-ranks test. Progress in seriation was judged by counting the 
changes from initial score in the five-point scale whether positive or negative. The percentage of 
moves which could theoretically be of 1-, 2-, 3- or 4-step size across the scale was calculated 
for each of these amounts respectively. Descriptive competence was measured by the relative 
number of responses at level ‘e’ taken as a percentage of the total number of prompts for each 
individual. 
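
For readers unfamiliar with the between-group comparison, the following is a minimal sketch of how a Mann-Whitney U statistic can be computed from two sets of five-point scores. It is an illustration under stated assumptions, not the procedure actually run for this study: the helper names are hypothetical, the example scores are invented, and significance would still have to be read from published tables of U.

```python
def average_ranks(values):
    """Rank from 1 upward; tied values share the mean of the ranks they span."""
    ordered = sorted(values)
    ranks = []
    for v in values:
        first = ordered.index(v)                          # first 0-based position of v
        last = len(ordered) - 1 - ordered[::-1].index(v)  # last 0-based position of v
        ranks.append((first + last) / 2 + 1)
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return the smaller of the two U values for two independent groups."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return min(u_a, u_b)


# Invented scores on the five-point scale, for illustration only.
experimental = [3, 4, 4, 5]
control = [2, 2, 3, 3]
u = mann_whitney_u(experimental, control)
```

Because scores on a five-point scale are often tied, tied values share an average rank, which is how half-integer values of U can arise.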


Results 
Analysis of seriation tasks 


Consistent results were obtained from the six tasks (product moment correlations of tasks 1-3 
were: 1 and 2, r = 0.86, P < 0.001; 2 and 3, r = 0.58; 1 and 3, r = 0.54, P < 0.01. Tasks 4-6: 4 and 5, 
r = 0.94; 5 and 6, r = 0.95; 4 and 6, r = 0.84, P < 0.001. Tasks 1-3/4-6 summed: r = 0.54, P < 0.01; 
d.f. = 38, initial scores for 40 subjects). Progress between sessions was fairly uniform with little 
fluctuation. Initial scores were represented at both ‘pre-operational’ and ‘intermediate’ levels, 
there being equal progress from all. Overall comparison of initial seriation levels between groups 
showed no difference (MC/LWC Es U = 45, n.s.; Cs U = 27.5, n.s.; MC Es/Cs U = 27.5, n.s.; 
LWC Es/Cs U = 41, n.s.). 


‘Language’: descriptive levels compared 


(1) Median percentages of responses at level ‘e’ to prompts, for both groups, were: Es pre-test 
LWC 55.35, MC 156.3; post-test I LWC 462.5, MC 750; post-test II LWC 412.5, MC 587.5. Cs 
pre-test LWC 22.25, MC 200; post-test I LWC 145.45, MC 483.3; post-test II LWC 275, MC 
493.3. 

(2) Initially: the MC were clearly more competent than the LWC group (MC > LWC Es U = 8, 
P < 0.002; Cs U = 10, P < 0.002); experimental and control subjects were not different in the MC 
group (U = 40, n.s.) although LWC controls were less advanced than their experimental 
counterparts (U = 23, P < 0.05). 

(3) After training: these comparisons showed that MC experimental as opposed to control 
subjects continued to be better, though not significantly so, than LWC subjects at the first 
post-test, but that this deficit had been dissipated by the second post-test (post-test I MC/LWC 
Es U = 27, P < 0.10; Cs U = 22.5, P < 0.05) (post-test II Es U = 37, n.s.; Cs U = 28, n.s.). 

(4) Between sessions: progress from pre-test to post-test I was significant for both groups of 
experimental subjects and their controls and in both cases there was no further significant 
improvement in descriptions from post-test I to II (at post-test I Es LWC T = 0, n = 10, P < 0.005; 
MC T = 0, n = 10, P < 0.005; Cs LWC T = 1, n = 8, P < 0.01; MC T = 3, P < 0.005) (at post-test II Es 
LWC T = 16, n = 9, n.s.; MC T = 25, n = 10, n.s.; Cs LWC T = 14, n = 10, n.s.; MC T = 24, n = 10, n.s.). 
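
The T values for progress between sessions come from the Wilcoxon matched-pairs signed-ranks test. The sketch below shows one common way of obtaining T and the accompanying n from paired scores; it is illustrative only, the function names and scores are hypothetical, and it assumes the usual convention that pairs showing no change are dropped before ranking. On that assumption, an n smaller than the group size would presumably reflect subjects whose scores did not change.

```python
def average_ranks(values):
    """Rank from 1 upward; tied values share the mean of the ranks they span."""
    ordered = sorted(values)
    ranks = []
    for v in values:
        first = ordered.index(v)                          # first 0-based position of v
        last = len(ordered) - 1 - ordered[::-1].index(v)  # last 0-based position of v
        ranks.append((first + last) / 2 + 1)
    return ranks


def wilcoxon_t(before, after):
    """Return (T, n): the smaller signed-rank sum and the number of non-zero differences."""
    diffs = [a - b for b, a in zip(before, after) if a != b]  # drop unchanged pairs
    ranks = average_ranks([abs(d) for d in diffs])
    positive = sum(r for r, d in zip(ranks, diffs) if d > 0)
    negative = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return min(positive, negative), len(diffs)


# Invented pre-test and post-test scores for five children, for illustration only.
pre_test = [2, 2, 3, 1, 2]
post_test = [3, 4, 3, 2, 1]
t_value, n_used = wilcoxon_t(pre_test, post_test)
```

A T of 0 means that every child whose score changed moved in the same direction.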


Progress in seriation 


After training the changes between levels 1 to 5 which theoretically could be moves of a 
one-step to as much as a four-step type between substages, and which were summed across the 
six tasks for individuals, at best consisted in progress from levels 2 to 5 in the first three tasks or 
from 1 to 4 in the second three in two out of three tasks (‘3-step’ moves). But ‘one’- and 
‘two-step’ moves were usually the case. The percentages of all moves which were of a 1-, 2-, 
3- or 4-step type respectively were: Es MC 61, 30, 9, 0 per cent; LWC 74, 22, 4, 0 per cent; Cs 
MC 79, 20, 0, 0 per cent; LWC 88, 12, 0, 0 per cent. No subjects used logical strategies 
systematically, i.e. level 5 in tasks 4, 5 and 6, and therefore none became ‘operational seriators’ 
according to the present criteria. But it is clear that systematic perceptual strategies of selection 
and comparison of rods were developing, whilst logical techniques were emerging on occasions. 
These were used by seven subjects, five experimental (3MC, 2LWC) and two MC controls. 
Mean score levels for the MC and LWC groups at sessions I, II and III for the first and second 
three tasks respectively were: I MC 2.3, 1.6; LWC 2.2, 1.6; II MC 3.6, 2.6; LWC 3.1, 1.8; III 
MC 4.1, 2.8; LWC 3.6, 2.4. 
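
The tally of step sizes reported above can be sketched as follows. This is a hedged illustration rather than the scoring procedure itself: the function name `step_percentages` and the example scores are hypothetical, and it assumes that each change of level between two sessions counts as one move whose size is the number of scale points crossed, whether up or down.

```python
from collections import Counter


def step_percentages(initial, later, max_step=4):
    """Percentage of all moves that were 1-, 2-, ... max_step-step in size."""
    # A move is any change of level; unchanged scores are not moves.
    steps = [abs(b - a) for a, b in zip(initial, later) if a != b]
    counts = Counter(steps)
    total = len(steps)
    if total == 0:
        return [0.0] * max_step
    return [100 * counts[size] / total for size in range(1, max_step + 1)]


# Invented levels (1-5) for five task scores, for illustration only.
levels_before = [1, 2, 2, 3, 1]
levels_after = [2, 4, 3, 3, 4]
percentages = step_percentages(levels_before, levels_after)
```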


Comparisons of progress in seriation 

All experimental subjects made significantly better progress than controls (post-test I MC 
Es > Cs U = 6.5, P < 0.001; LWC Es > Cs U = 10, P < 0.001) (post-test II MC Es/Cs U = 47.5, 
n.s.; LWC Es > Cs U = 18.5, P < 0.01) (total MC Es > Cs U = 5, P < 0.001; LWC Es > Cs, U = 5S, 
P < 0.001). MC experimental subjects were in advance of the LWC at the first post-testing but 
were subsequently equalled by the LWC group at post-test II (post-test I MC > LWC U = 18.5, 
P < 0.02; post-test II U = 40, n.s.; total U = 31, n.s.). Comparisons of control subjects between 
groups were similar (post-test I MC > LWC U = 22, P < 0.05; post-test II U = 30, n.s.; total 
U = 40, n.s.). 

Between sessions most progress was made by the MC group at post-test I, a level which did 
not alter significantly by post-test II (pre- to post-test I Es T = 0, n = 10, P < 0.01; Cs T = 3.5, n = 9, 
P < 0.05) (post-test I to post-test II Es T = 4, n = 7, n.s.; Cs T = 3, n = 7, n.s.). Among the LWC 
progress continued between the three sessions (pre- to post-test I Es T = 0, n = 9, P < 0.01; Cs 
T = 7.5, n = 5, n.s.) (post-test I to post-test II Es T = 0, n = 10, P < 0.01; Cs T = 2.5, n = 8, P < 0.025). 

The relation of ‘language’ to ‘logic’ is such as to link pre-test descriptive competence to 
post-test progress in seriation in each experimental group. This may be illustrated by correlating 
pre-test ‘level e’ descriptions to total seriation progress (MC rho = 0.61, P < 0.05; LWC 
rho = -0.03, n.s.). A similar pattern may tentatively be suggested for initial competence in 
describing ‘the middle of 3 elements’ which was possessed by the three MC experimental 
subjects who subsequently and in contrast to others began to use ‘logical’ - ‘double 
comparison’ — strategies in the insertion tasks. Two of these subjects began to use the ‘logic’ at 
post-test I, the third in two of the three tasks at post-test II. The two LWC experimental 
subjects, without having the initial description, began to use this ‘logic’ each in two tasks at 
post-test II. Of the two MC control subjects who eventually used the ‘logic’, one initially had 
the description. 
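
The rank correlation (rho) between pre-test descriptive competence and total seriation progress can be illustrated with a short sketch. The code below is an assumption-laden example, not the computation used in the study: the function names and the data are hypothetical, and rho is obtained as the product moment correlation of the two sets of ranks, which handles tied scores.

```python
def average_ranks(values):
    """Rank from 1 upward; tied values share the mean of the ranks they span."""
    ordered = sorted(values)
    ranks = []
    for v in values:
        first = ordered.index(v)                          # first 0-based position of v
        last = len(ordered) - 1 - ordered[::-1].index(v)  # last 0-based position of v
        ranks.append((first + last) / 2 + 1)
    return ranks


def spearman_rho(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    spread_x = sum((a - mean_x) ** 2 for a in rx) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in ry) ** 0.5
    return covariance / (spread_x * spread_y)


# Invented values: percentage of 'level e' descriptions and total seriation progress.
level_e_percent = [10, 20, 30, 40, 50]
seriation_progress = [2, 1, 4, 3, 5]
rho = spearman_rho(level_e_percent, seriation_progress)
```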
Summary of results 


Two groups of children (LWC and MC), initially at different levels of competence in describing 
size relations were selected as comparable in seriation. Thus selected, mean chronological ages 
were six months less in the MC group (CA, MC 5½, LWC 6). Experimental subjects in each 
group then had training in the appropriate use of the descriptions, e.g. ‘bigger/smaller’. Control 
subjects did not. All subjects were subsequently tested for progress in seriation. Differences in 
descriptive competence were most apparent in the relative number of prompts required to elicit 
appropriate descriptions. This was significantly higher in the LWC group who similarly needed 
longer training sessions than the MC children. Learning difficulties related to giving reverse 
descriptions appropriately but were most intractable for describing ‘the middle of 3 elements’. 
Nevertheless all subjects achieved the required descriptive competence by the end of three 
learning sessions and all were initially able to comprehend the comparative terms concerned. 
The type and amount of progress in seriation was assessed in six tasks where improvement in 
systematic strategies of selection and comparison of rods was clearly emerging. In this manner a 
minority of subjects advanced as much as two or three substages in one move on a five-point 
scale, although most subjects only progressed by one step at a time. All but a few unsystematic 
attempts by seven children (5 Es, 2 Cs) relied on perceptual judgements and thus according to 
the criteria set here, none fully reached an operational seriation level of understanding. Progress 
was significantly better among experimental groups than the controls, and MC subjects (to whom 
the appropriate use of descriptions was initially available) made significantly better progress than 
the LWC group immediately after training. The LWC group had drawn level two weeks 
later without further training. The characteristic pattern of progress for each experimental group 
respectively occurred among controls though significantly less in amount. The positive relation 
of initial descriptive competence and prompt progress in seriation appears also to occur for the 
‘description of the middle of 3 elements’ and the development of subsequent ‘logical’ seriation 
strategies. 


Discussion 


Detailed comparison of the present findings with those of Sinclair-de-Zwart is precluded by 
differences in method and design, but in general it is probably fair to suggest that the progress in 
seriation which followed ‘language training’ was similar in type and amount in both studies. The 
control groups used in the present study allow us to see that progress is significantly related to 
‘language training’ and differences in descriptive competence. Clearly the learning difficulties 
encountered in both studies were similar and this is germane to the problem if we agree with 
Sinclair-de-Zwart that they are logical rather than linguistic in nature. But here, we are free to 
shift from Sinclair-de-Zwart’s cognitive emphasis to an interpretation which includes combined 
linguistic and social influences in an active relation with the emerging logic of the individual. 
The present method of dealing with the problem of the relation of ‘language’ and ‘logic’ in 
development has been to compare progress in the latter upon the provision of ‘language training’ 
to subjects initially at different levels of descriptive competence.* In this circumstance it may be 
hypothesized that to the extent that ‘language’ and ‘logic’ are thought to interact in some way, 
initial presence of appropriate ‘language’ at each successive point of ‘logical’ progress might be 
the condition giving rise to more rapid effects from training in ‘language’ than an initial lack of 
such ‘language’. It is this theoretical position which best fits the results of the present 
experiment. Here it seems that the influence of ‘language’ in cognitive development, at least as 
it concerns seriation and the related description, is something more than a ‘medium’ or channel 
of representation, and less than a ‘mould’ of developing thought. Rather our results seem to 


* The terms ‘language’ and ‘logic’ are used here as short-hand to apply to the specific aspects of each 
with which we are concerned in the present experiment. 
point towards the idea of some more complex interactive process of developing speech and 
cognition; one in which the developmental flow must preclude any static or wholly malleable 
features, either linguistic or cognitive. The changing flow of linguistic usage appears in the fact 
that comparative terms are comprehended by all subjects in some sense and differences in 
descriptive competence between groups lay in the precision of reference which was more or less 
easily elicited in dialogue over a period of time. In fact progress was occurring in all subjects 
including the controls. The control subjects progressed significantly less in amount than 
experimental subjects but preserved the timing pattern typical to each group respectively. Thus a 
characteristic descriptive-logical process was continuing under the influence of tests alone. But 
the ‘language training’ was all in the speech mode, which is a flexible and rapid medium of 
reference and in this context constrains analysis of the particular essential relations. 
Furthermore, much of this referential focus must be derived from the nature of the dialogue 
itself which in turn may instigate and expedite conscious reflexion. Such reflexion would not only 
probe details of the constancy of relations in this particular logical organization but might tend to 
detach the descriptions from their specific context by generalization to all parts of a series and 
among perceptually different sets. The speech being dialogue, moreover, both draws from the 
child the precise references required and must also have the added force of objectivity, since 
experimenter and subject are joint and independently situated witnesses of the necessity of the 
relations and their markers (Strawson, 1959; Bruner, 1973). Thus to analyse the particular 
situation of this experiment: it is one in which speech-in-context is allowed to progress more or 
less closely with interleaved action (production as opposed to descriptions of relations). The two 
contrasted groups (MC and LWC) are all ‘speakers’, but to some it is quite normal to refer to 
relations of various kinds (MC), to others, this is a less familiar use (LWC). Clearly we have a 
situation where the MC group are ready to operate, using speech in the mutual exchange of 
dialogue as a cognitive instrument. Thus pre-test and ‘language training’ together ‘trigger’ 
progress in further more competent production of relations seen in post-test I. The LWC have 
not the ready priming of descriptive usage but given some reorientation through ‘language 
training’, followed by the production of relations in post-test I are now ‘triggered’ for the 
complicated transaction of detachment and objectification of descriptions of relations. As to the 
pattern of ‘logical’ progress itself, we may be able to relate this to the influence of ‘language’ in 
at least three ways which are combined: (1) quantity at differential rates; (2) quality; (3) 
sequential timing. These refer to: (1) the fact that there was appreciably more progress in all 
cases where dialogue had intervened, bearing in mind also that on selection at equal seriation 
levels the MC group were younger than the LWC though they were more advanced in 
descriptions. The ‘interaction hypothesis’ is then forced upon us by the prompt response in 
‘logic’ of subjects ready with descriptions; (2) our results suggest that precise qualitative levels 
of ‘descriptive readiness’ may relate directly to concomitant levels in ‘logic’. The bulk of 
progress in seriation was a matter of ordinal assembly of rods based on perceptual cues and 
strategies. This coincided with descriptive readiness in the reversible use of ‘bigger/smaller’. 
Then, at the final point of change, the adoption of ‘logical’ (‘double-comparison’) strategies 
occurred among subjects who were ready with the apposite ‘three element’ description; (3) the 
necessary ingredients for interaction appear to include the sequential timing of descriptive 
readiness with logical production where one precedes or follows the other. We have then a 
possible ‘triggering’ of ‘language’ and ‘logic’ interaction where quality of reference is precisely 
primed. 

These findings, thus interpreted, do not contradict the language facilitation model of Piaget; 
rather they go further, in that they stress and elaborate the influence of speech. Likewise, 
Bruner’s original suggestion that language can be an instrument of a social influence is 
endorsed. Indeed Bruner’s more recent opinion admits the ‘cognition hypothesis’ of language 
acquisition which derives from Piaget (Cromer, 1974), whilst retaining his original stress on the 
importance of social linguistic influence in cognitive development (Bruner, 1975). At least at the 
period of early and prelinguistic acquisition, current trends of thought lean towards some form 
of interaction of speech and cognition where speech is analysed by function (Macnamara, 1972; 
Ryan, 1974; Cromer, 1974; Bruner, 1973). Such a relationship may be fairly obvious in early 
ontogeny but the present findings provide suggestive evidence consonant with a similar 
theoretical position at a later period of development, with some specific indication of necessary 
components and their relative timing in the interaction concerned. Furthermore, according to 
Piaget’s scheme of emerging ‘logic’, it appears that the child passes through a period of 
perceptually dominated strategies and judgements but it is not clear how the transition to logical 
understanding occurs. In this experiment we have information relating timing and sequence of 
progress in ‘logic’ and the related descriptions which provide the intriguing, if tentative, 
possibility that at the final point of descriptive precision (three elements), speech in dialogue 
form may free the ‘logical’ strategies from their perceptual basis in the ongoing process of 
organization and production of asymmetrical size order relations (Piaget, 1950). 

The discovery of segments in natural language 


J. G. Wolff 





The performance of a computer model of linguistic segmentation is described and evaluated when it is used 
with natural language. It identifies words quite successfully and seems to have some sensitivity to morphs 
but it performs poorly with structures larger than words. From the language samples, the program extracts 
most of the sequential redundancy and some of the redundancy due to the unequal frequencies of elements. 
This accords with the principle of economical coding in cognition (Attneave, 1954; Oldfield, 1954). The 
process seems also to model certain aspects of how children's vocabularies grow and the increasing lengths 
of the words which children acquire. It may have a bearing on the explanation of infantile amnesia and the 
word transformation effect. 





A computer model (program MK10) which was described in a previous paper (Wolff, 1975 a) was 
intended to show how the segmental structure of a language may be learned by young children 
despite the apparent absence of consistent physical markers like pause or stress for the 
boundaries of words and other linguistic segments. The performance of the model was illustrated 
mainly using artificial languages. The main purpose of this paper is to present and evaluate 
results from natural language and to relate these results firstly to the principle of economical 
coding in cognition and secondly to established features of vocabulary growth. Other phenomena 
and related work are considered in a discussion of the model's plausibility. 

Since there has been some misunderstanding of this point it is as well to emphasize that the 
problem under attack is not to identify pre-established segments in unsegmented data (see, for 
example, Reddy, Erman & Neely, 1973) but to discover what segments are implicit in the data. 
A young child has to discover the segmentation of language rather than merely recognize 
segments he already knows. 


Results from natural language 


Three texts have been prepared in the usual manner with all spaces and punctuation omitted. 
Text 1 (10000 letters) is taken from book 8A of the Ladybird Reading Series. Text 2 (20000 
letters) comes from Paul Gallico's novel Jennie. Text 3 (20000 letters) is adult speech taken 
from transcripts of recordings of the spontaneous speech of children and adults.* The sample 
contains roughly equal amounts of material from homes classified as ‘professional’ and 
‘non-professional’. It is taken to be representative of the kind of speech which young children 
hear. These texts have been used as data for a variant of the computer program previously 
described. This version, MK10H, covers the whole sample on every scan and at the end of each 
scan selects the most frequently occurring contiguous pair of elements (or an arbitrary choice 
amongst ties). The two versions produce essentially the same results but MK10H has the merit 
of covering exactly the same sample on every scan. This means that precise frequencies of 
occurrence of elements in the sample can be obtained which have theoretical uses considered 
elsewhere. 
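
The scanning procedure just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reconstruction of the original program: the names `segment`, `most_frequent_pair` and `run_scans` are hypothetical, each scan is assumed to parse the text greedily with the largest matching element, and ties are broken by first occurrence rather than by an arbitrary choice.

```python
from collections import Counter


def segment(text, elements):
    """Parse text left to right, always taking the largest matching element."""
    longest = max(len(e) for e in elements)
    segments, i = [], 0
    while i < len(text):
        for size in range(min(longest, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in elements:
                segments.append(candidate)
                i += size
                break
        else:
            raise ValueError(f"no element matches at position {i}")
    return segments


def most_frequent_pair(segments):
    """Return the most frequent contiguous pair of elements, joined, or None."""
    pairs = Counter(zip(segments, segments[1:]))
    if not pairs:
        return None
    (left, right), _count = pairs.most_common(1)[0]
    return left + right


def run_scans(text, n_scans):
    """Start from single letters and add one new element per scan (assumed scheme)."""
    elements = set(text)
    for _ in range(n_scans):
        new_element = most_frequent_pair(segment(text, elements))
        if new_element is None:
            break
        elements.add(new_element)
    return elements


sample = "THETHETHE"  # toy unsegmented text, not one of the paper's samples
elements = run_scans(sample, 2)
print(segment(sample, elements))
```

On this toy text two scans are enough to build THE from single letters; the natural language samples needed several hundred scans.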

Small sections (125 words each) of the resulting segmentations are shown in Tables 1, 2 and 3 
for Text 1 after 501 scans, Text 2 after 951 scans and Text 3 after 572 scans, respectively. Each 
segment was printed out with its reference number so that in the full print-out it was possible to 
trace the full hierarchical structure of every segment. These structures are marked by brackets 


* I am grateful to Dr Gordon Wells (School of Education Research Unit, University of Bristol, Lyndale 
House, 19 Berkeley Square, Bristol BS8 1HF) for making these available. 


Table 1. Segmentation of Text 1 by program MK10H (10000 letter sample, 501 scans) 


((IT)(IS)(SUMMER)(TIME)(SCHOOL)(ISXKOVER)(AND)(THE)((LONG)(SUMMER)) 
(HOLIDAY)(IS)(HERE)(JANE)((AND)(PETER))(T)(AL)(K)(ABOUT)(THEIR) 
((LONG(SUMMER)(HOLIDAY)(AND(WHAT)(THEY)(ARE)(GOINGX(TOYDO)) 
((XL))(IKE))(SCHOOL)((SAYS)(PETER)XBUT)((1)(AM))(GLAD)(THE) 
(HOLIDAYXHAS)COM)(EXYES)()(AM)(GLAD)TOO)(SAYS)JANE)) 
(XD(L)KEJSUNNY)DAYS(WHEN)(WE)HAV E)(NOYW)(ORYK) 
(TOXDO)K(THEREARE)(SOXMANYXNICEXYTHINGY(STXO)(DO) 
(INXTHE)(HOLIDAY)WHEN)(IT)IS)JySUNNYX(Y ES)(SAYS)(PETER))) d 
(ANDXDADDYYTHINK)(SXIT)(DOES(USY(GOOD)(TOYGET)(OUT) 
((IN)(THE)(SUN)((WE} WILL))(BEXOUT)(EVERY)(DAY)(WHEN)((THE)(SUN)) 
((COMESXOUT))(DO)(YOUKNOW)((THERE)(IS)(AN}(OLD)(DONKEY) 
((UP)((ATX{THE)))(FARM)(NOW)(ASKS)JANE))((HE)(IS))(TOO)(OLD) 
(TO(WORXKXPAM). . . 





Table 2. Segmentation of Text 2 by program MK10H (20000 letter sample, 951 scans) 








((HE)(HAD))(WANTEDXTO)(HOLD)((AND)(S))(TRO)(KE)((THE)(KITTEN)) 
(NANNY)(HAD)(SC)(REXAM)((ED)(AND))(THERE)((HAD)(BEEN)) 
((A)((KIND)(OF)))(AN)(AW){FUL)(B){(UMP)(AFTER)(WHICHXIT) 
((SEEMEDY(TO)(HAV E)(TURNEDXFROM)(DA Y)((TO)(N))(IGHT) 
(AS(THOUGH)(THEXS)(UN)WEREX(GONEYAND)X(IT(HAD) 
(BECOMEYQUIT(ED(ARK)(HE(ACHK(ED(AND)(SOME)WHEREJ(T) 
(HURT\HIM)(AS)((IT)(HAD))(((WHEN)(HE)HAD))}(F)(ALL)(ENXRUNNING) 
(AFTER)(AFXOO)(TXBALL)(NEAR)(AG)(RA)(VE)(L)(PIL)(E)(AND)) 
(SC)(RAXPED)(NEAR)(LY}ALL)((THEXS)K)(IN)((FROM)THE))(SID) 
((E)(OF))(ONE)LEG}HE)((SEEMEDXTO))(BE))((IN)(BED))(NOW)(AND) 
(NANNY)(WAS)THERE)(PE)(ERX(ING)((AT}(HIM)))(IN)(AN))(ODD)(WAY) 
((THATXIS)(FIRST)((SHEXWOULD)BE)(QUITX(ECXLOSE))((TO)(HIM)) 
(SOKCLOSE)(((THAT)(HE))(COULD){SEE))(HOW)(WHITE)(HER)(FACE)(WAS) 
(INSTEAD)((OF)(T)(SUXSU)(AL)(W)(RI(NK)(LED)(PINKKCOLOU). . . 








Table 3. Segmentation of Text 3 by program MK10H (20000 letter sample, 572 scans) 








(NOXDARLINGX(NOXNO)X(NOYNO)XYES)WHENJTHEY)(ARE) 
(WAS)(HE)(DY)(OUCAN)(NOT)(BEFORXE)(OH)(SHE)(S)(NOT)((SHEXS)) 
(NOTXAD)(IRXTY)(CAT)(ARE)(Y OU) (DARLING)(NO)(LEAVE)((MUMMY)(S)) 
(WAS)(HING)(ALONE)(PLEASE)((MUMMYXS))((GOT)TO))(WAS)(HAXLL) 
(THAT)(YES)((THERE)(S)(¥ OUR)(SO)(CK)(SM)(UM)(MY)(SW)(AS)(HING) 
(THEM)((I)( VE)X(GOT)(TO))(DO)(ALL)(THATXNOW)(YES)(LIND)(AB) 
(OUGHT(YOUXSOXCK)(ST)(HE)KYYREYDD(R(TY)THEYYVEY(GOT(TO) 
(BEXWAS)(HE)(D)(PARDONXNO)((MUMMY)(S)((GOING)(TO))(WAS)(H)(THEM) 
(NO)(LIND}(AS)((NOT)((GOING)(TO)){WAS)(H)(THEM)(NO)(MUMMY)(WAS)(H) 
(THEM)(PARDON)(PARDON)(((THAT)(S)KNOT))((YOURXB))(RO)(OMCH) 
THATY(S)(YOURYM)XIC(ROP(HONEXNO)TT(SYNT)(NOXYOU) 
(WONT(FALLY((NOWYSTOPY(THATY(S)üLLXYYNO)JISXEPXLE)(ASE) 
(YES)(IN)((AM)(OMENT)XDEAR)(PARDON)(SH)(OW)(ME) .. 

Table 4. The correspondence between printer's words and structures assigned by MK10H to 
natural language samples 

                                            Text 1              Text 2              Text 3
                                           O     E    χ²       O     E    χ²       O     E    χ²

Printer's words marked by a node         116    31   233     101    19   354      96    32   128
Printer's words not marked by a node       9    94    77      24   106    64      29    93    44
Total χ²                                             310*                418*                172*

* All values of χ² have one degree of freedom. 



in Tables 1, 2 and 3 only to sufficient depth to show whether or not each printer’s word 
corresponds to a node. 


Evaluation of results 


Printer’s words. Table 4 (columns O) summarizes the correspondence between printer’s words 
and nodes assigned to the text by the program. In Text 3 the contractions like ‘Mummy’s’, 
‘I’ve’, ‘there’s’, were each counted as two words. ‘Won’t’ and ‘Mummy’s’ in the possessive 
sense were each counted as one word. The null hypothesis chosen for the purpose of assessing 
the statistical significance of these results was that the structures had been assigned to the text in 
random order. Accordingly, to each 125-word sample the full hierarchical structures assigned by 
MK10H (not merely those shown in Tables 1, 2 and 3) were reassigned in a random order. For 
each sample the most ‘successful’ of three such ‘segmentations’ formed the basis for the 
expected values shown in Table 4 (columns E). The values of chi squared calculated from these 
observed and expected frequencies and shown in Table 4 allow the null hypothesis to be rejected 
with confidence in all three cases (P ≪ 0.01). 
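
The chi-squared values can be checked with a short calculation. The sketch below assumes the usual goodness-of-fit formula, the sum of (O - E)²/E over the two categories, and takes the observed and expected frequencies as we read them from Table 4; the function name `chi_squared` is ours. It gives roughly 310, 417 and 172 for the three texts, against the printed 310, 418 and 172. The small difference for Text 2 presumably comes from rounding the two components before adding them.

```python
def chi_squared(observed, expected):
    """Goodness-of-fit statistic: sum of (O - E)^2 / E over the categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# (marked by a node, not marked by a node) per 125-word sample,
# as read from Table 4: observed counts first, then expected counts.
table4 = {
    "Text 1": ((116, 9), (31, 94)),
    "Text 2": ((101, 24), (19, 106)),
    "Text 3": ((96, 29), (32, 93)),
}

for name, (observed, expected) in table4.items():
    print(name, round(chi_squared(observed, expected), 1))
```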


Errors in identifying words. Experience of running MK10 on artificial texts suggests that errors 
arise from three main sources: too small a sample, insufficient processing and the existence of 
high frequency letter sequences bridging word boundaries. Without repeated and expensive 
processing of natural language texts under varying conditions it is not easy to establish the 
origins of particular errors but tentative assignments may be made. Thus where a word is merely 
fragmented (e.g. (AW)(FUL) in Table 2) it is likely that longer processing will correct the error. 
It often happens in the early stages of processing that a high frequency sequence like TH results 
in the wrong segmentation of a sequence like ... WHATHE ... However, when WHAT and HE 
have been built up this error is corrected. A possible example of this type of error in Table 2 is 
the sequence (QUIT)(ED)(ARK). If the sample is large enough for QUITE and DARK to occur 
in a range of other contexts then they will be built up by further processing and this error will be 
corrected. If the sample is too small then no amount of additional processing will correct the 
error. Much the same interpretation applies to the sequence (PE)(ER)((ING)((AT)(HIM))) in 
Table 2. Whether or not this error is corrected is likely to depend on sample size. 

But there is a type of error described in the previous paper which may be termed a 'run-on' 
error and which is not corrected either by increasing the sample size or extending the 
processing. An example from Table 2 is the sequence (AF)(OO)(T)(BALL). The element (AF) is 
formed as part of the word AFTER and its presence in the dictionary of elements means that the 
sequence AFOOT will always be wrongly segmented because the program always seeks the 
largest element which matches a given letter sequence - in this case AF is preferred to A. An 
example from Text 1 (not shown in Table 1) is the sequence ... 
HEISTOOOLDFORTHATNOWEMUSTNOTGETONHISBACK ... which is segmented as 
((HE)(IS))(TOO)(OLD)(FOR)(THAT)((NO)(W))(E)(MUST)(NOT)(GET)(ON)(HIS)(BACK). 
It is interesting that in reading this text one is apt to make the same ‘run-on’ error - to read 
NOW rather than NO WE.... 
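
The ‘run-on’ error follows directly from matching the largest element first, as a small Python sketch shows. The dictionary below is hypothetical: it holds the single letters plus a few composite elements chosen for illustration, and `longest_match_segment` is our own name for the greedy parse. Even with A and FOOT both available, AF is taken first and FOOT is never tried.

```python
def longest_match_segment(text, elements):
    """Greedy left-to-right parse that always takes the largest matching element."""
    longest = max(len(e) for e in elements)
    segments, i = [], 0
    while i < len(text):
        for size in range(min(longest, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in elements:
                segments.append(candidate)
                i += size
                break
        else:
            raise ValueError(f"no element matches at position {i}")
    return segments


# Hypothetical dictionary: single letters plus a few composite elements.
# AF stands for the fragment formed while building AFTER.
elements = set("AFOOTBALL") | {"AF", "OO", "FOOT", "BALL"}

print(longest_match_segment("AFOOTBALL", elements))
```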


Other structures. There is some evidence, from an examination of the internal structures 
assigned to words, that MK10 is sensitive to linguistic structures smaller than words. Examples 
from Text 1 (not all shown in Table 1) are ((GO)(ING)), ((TELL)(S)), ((GO)(ES)), ((DO)(ES)) 
and ((DONKEY)(S)) in which morphs are marked by the program. It is also noticeable that a 
number of the errors made by the program involve the mis-assignment of the terminal S on 
verbs and plurals, e.g. (THING)(ST)(O)XDO), (THINK)(S)IT), (LOVE)((S)(HIM)), 
(GIVE)((STHE)(M)), (EAT)(SF)(R)(OM), etc. These are clear errors but they suggest that 
the program is sensitive to the division recognized by linguists between a verb or noun and the 
terminal S. There are however other cases, e.g. ((S)(AYS)), ((AS)(KS)) and ((COM)(ES)), where 
the conventional divisions are not marked. 

Program MK10 has now been run sufficiently far on natural language to be able to say that, in 
its present form, it performs poorly in identifying linguistic structures larger than words. Some 
coherent noun phrases and prepositional phrases are picked out (e.g. ((THE)(DONKEY)), 
((FOR)(HIM)), etc.) but conventional groupings are often violated (e.g. (((TO)(THE))(STABLE)), 
((I)(AM))(GLAD), ((SAYS)(PETER))(I), etc.). The tentative reasons advanced here for this 
relative failure are these: 

1. The process almost certainly needs to operate in conjunction with some kind of 
classification process. The main reason for this is that combinations of specific words are likely 
to occur so very infrequently that a clustering process based on conjoint frequencies will never 
get very far whereas particular combinations of classes of words may be expected to occur more 
frequently. The details of the structures formed above word level are likely to be influenced by 
the classification process in ways which cannot at present be specified. 

2. Whereas the distributional reality of words without reference to meanings has been 
demonstrated it may be that the forms taken by larger structures are influenced partly by 
intralinguistic relations and partly by relations existing between intra- and extra-linguistic 
entities. 

3. MK10 suffers from much the same deficiencies as simple phrase structure grammars 
(Chomsky, 1957; Postal, 1964): it cannot handle recursive structures (e.g. ‘Roger, Jane, Liz... 
and John’) or discontinuous constituents (e.g. ‘If ... then ...’). The latter problem is considered 
in Wolff (1975 b). 


Redundancy 


After 501 scans program MK10H had 527 elements in store and had divided the 10000 letters of 
Text 1 into 2548 segments. A message using an 'alphabet' of 527 elements requires not less than 
ten bits per element (10 > log₂ 527 > 9), so this text could be transmitted or stored using 
2548 × 10 = 25480 bits. If the text were transmitted simply as alphabetic characters, each one 
requiring not less than five bits (5 > log₂ 26 > 4), then 10000 × 5 = 50000 bits would be required. A 
saving of 49 per cent has been achieved. Similar calculations for Text 2 and 3 show savings of 
41 and 37 per cent respectively. 
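
The arithmetic for Text 1 can be reproduced as follows. This is only a sketch of the calculation given above: it assumes a fixed-length code in which every element takes the same whole number of bits, and the function names are ours.

```python
import math


def bits_per_element(alphabet_size):
    """Smallest whole number of bits that can distinguish alphabet_size elements."""
    return math.ceil(math.log2(alphabet_size))


def saving(n_letters, n_segments, n_elements, letter_alphabet=26):
    """Fractional saving of the segment code over a plain letter-by-letter code."""
    letter_bits = n_letters * bits_per_element(letter_alphabet)
    segment_bits = n_segments * bits_per_element(n_elements)
    return 1 - segment_bits / letter_bits


# Text 1: 10000 letters, 2548 segments, 527 elements in store.
print(bits_per_element(527), bits_per_element(26))
print(f"{saving(10000, 2548, 527):.0%}")
```

With 512 elements in store `bits_per_element` would return 9 rather than 10, which is the wastage the next paragraph refers to.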

A slightly greater saving would probably have been effected if either 512 or 1024 elements 
were in store so that nine or ten bits per element could be used without wastage (log₂ 512 = 9; 
log₂ 1024 = 10). Alternatively some additional coding process, such as that described by Edwards 
(1969, pp. 60-63), could be employed to avoid this wastage. 

The savings obtained are not far short of the estimated figure of 50 per cent for the sequential 
redundancy of English given by Garner & Carson (1960). It seems that MK10 successfully 
extracts most of this redundancy from the text but a small part of the saving can be attributed to 
a reduction of the redundancy due to the unequal frequencies of elements. For example, the 
redundancy of Text 2 from this source, calculated using the Shannon-Wiener formula (see 
Edwards, 1969, pp. 42-45), is 11.6 per cent prior to analysis by MK10H. After 951 scans by 
MK10H this redundancy has dropped to 6.4 per cent. 
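
For readers who want to see how such a figure is obtained, the redundancy due to unequal frequencies may be sketched as below. The definition used is an assumption on our part, since the text does not spell it out: redundancy is taken as 1 - H/H_max, where H is the Shannon entropy of the observed element frequencies and H_max is the logarithm (base 2) of the number of distinct elements. The function name is hypothetical.

```python
import math
from collections import Counter


def frequency_redundancy(tokens):
    """Redundancy from unequal frequencies: 1 - H / log2(number of distinct tokens)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if len(counts) < 2:
        # With a single distinct token H_max is zero; return 0.0 by convention here.
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return 1 - entropy / math.log2(len(counts))


print(frequency_redundancy("AABB"))            # equal frequencies
print(round(frequency_redundancy("AAAB"), 3))  # unequal frequencies
```

The same function applies to letters before analysis and to the list of segments after analysis, since it only needs a sequence of tokens.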

In general, the rather dramatic reduction effected by MK10H in the redundancy of natural 
language texts shows that MK10H is consistent with the principle of economy in cognition put 
forward by Attneave (1954) and Oldfield (1954). 


Vocabulary growth 


The ambiguous title of this section refers to the facts of increasing numbers of words in 
children's vocabularies and of the increasing size of the words themselves. Both phenomena will 
be considered. 


The rate of acquisition of words 


The estimation of children's (and adults') vocabularies is fraught with problems (see Seashore & 
Eckerson, 1940; M. K. Smith, 1941; Ellegård, 1960) but, notwithstanding these difficulties, a 
broad pattern can be recognized. First words appear at about 12 months. In the following few 
months new words are acquired slowly but then the pace accelerates dramatically up to about 36 
months (Grant, 1915; M. E. Smith, 1926). Throughout the rest of childhood the rate of 
acquisition of new words seems to be fast without showing much sign of slackening (Magni, 
1919; Gansl, 1939; M. K. Smith, 1941; Hamlin, 1944). Gansl suggests that the acquisition curve is 
smooth while M. K. Smith finds it to be rather irregular. M. K. Smith's estimate of the rate of 
acquisition is about 2700 words per year on average throughout childhood. Towards adulthood 
the rate seems to slacken (Hamlin, 1944) and decline throughout the adult years reaching a 
relatively low level in old age (Clément & Bourlière, 1959). One of the several problems in 
estimating someone's vocabulary is establishing a criterion for ‘knowing’ a word. Receptive and 
expressive criteria have been employed variously in the literature so the general pattern outlined 
here is a composite one which is assumed to apply to both receptive and expressive vocabulary. 

There is a suggestion that the rate of acquisition of words is higher in the third year than at 
any other time. Thus Lenneberg (1967) writes that 


An astonishing spurt in the ability to name things occurs at a definite stage in language development. It 
represents the culmination of a process that unfolds very slowly until the child reaches the age of about 18 
months, when he has learned to utter between three and 50 words. Then, suddenly and spontaneously, the 
process begins to gather momentum. There is a burst of activity at 24 to 30 months, so that by the time the 
child completes his third year, give or take a few months, he has built up a speaking vocabulary of more than 
a thousand words and probably understands another 2000 to 3000 words that he has not yet learned to use. 
This ‘naming explosion’ is only one of the extraordinary activities which mark the coming of language, 
perhaps the most human and least understood form of our behaviour... (p. 59). 


Lenneberg refers to a spurt in the ability to name things rather than to a peak in the rate of 
growth of vocabulary, probably because nouns predominate in young children's vocabularies 
(Pelsma, 1910; Grant, 1915). But it seems from a search of the literature on vocabulary studies 
(see Dale, 1963, for a bibliography) that the evidence for a peak stems exclusively from M. E. 
Smith's (1926) study which deals only with overall vocabulary. Other studies (e.g. Pelsma, 1910; 
Boyd, 1914; Nice, 1917) do not record vocabularies at sufficiently short intervals to show 
whether or not a peak occurs in infancy. Smith's study suffers from being a cross-sectional 
study: the peak could simply reflect differences between populations at different ages. Her 
method of estimating vocabulary sizes lacks the sophistication of later studies (see Ellegård, 
1960, in particular) and it is possible that a systematic underestimation of the larger vocabularies 

Figure 1. The number of words isolated from Texts 1, 2 and 3 by program MK10H at different stages of 
processing. 


of the older children could produce an artifactual peak. So it seems that the ‘naming explosion’ 
should be treated with caution. A less tendentious assertion would be that the rate of acquisition 
of words in the third year is at least as high as at any other period. 

When program MK10H is run on a text, one element is added to the program's dictionary for 
every scan of the text. If successive scans of the text are assumed to represent equal intervals 
of time in a child's development then we may plot the numbers of words formed in successive 
blocks of scans against the number of scans (or the number of elements in the dictionary) and 
compare the resulting curve with how children acquire words. 

Such plots for Texts 1, 2 and 3 are presented in Fig. 1. To iron out the major irregularities in 
the data a moving average technique is used so that each point shows the number of words 
amongst 78 elements consecutive in the program's dictionary and overlaps with its neighbours by 
26 elements. Despite this method of plotting, some fairly marked irregularities remain. It seems 
from a trial plot of random numbers on the same basis that most of the peaks and troughs can 
be explained on the assumption that words and non-words are produced in a random sequence. 
Only the major trends in these three curves should be regarded as significant. 
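
The plotting scheme can be made concrete with a small sketch. It assumes a list of flags, one per dictionary element in order of formation, saying whether the element is a word; how that judgement is made is outside the sketch. Windows of 78 elements that overlap their neighbours by 26 are read here as starting every 52 elements. The function name and the example data are hypothetical.

```python
def windowed_word_counts(is_word, window=78, overlap=26):
    """Count words in successive windows of the dictionary, in order of formation."""
    step = window - overlap  # 52 elements between the starts of neighbouring windows
    return [
        sum(is_word[start:start + window])
        for start in range(0, len(is_word) - window + 1, step)
    ]


# Hypothetical flags: every second element is a word.
flags = [True, False] * 91  # 182 elements
print(windowed_word_counts(flags))
```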

These trends can be summarized as follows. All three curves start at a relatively low level; 
they rise to their highest level at an early stage and then descend gradually thereafter. This 
pattern is most marked for Text 1 and least apparent, but still present, in the curve for Text 2. 
The pattern is not unlike the pattern for children's acquisition of words but with two points of 
difference: (1) these curves do not start at zero; (2) with the possible exceptions of the curves 
for Text 1 and Text 3 there is not much sign of a plateau corresponding to the previously noted 
plateau in the rate of acquisition of words during childhood. As will be suggested below, it may 
be that both these differences can be regarded as artifacts of the modelling process. 

Why should the rate of acquisition of words by both children and the program follow this 
general pattern? A tentative explanation for both cases follows. 

In children the acquisition pattern must be influenced in part by the physical maturation of the 
brain but in this discussion this influence will be ignored. The explanation of the form of the 
acquisition curve offered here is based on the segmentation process embodied in program 
MK10H, described previously (Wolff, 1975 a). It is assumed that babies are equipped with 
minimal perceptual elements or analysers of some kind and it seems reasonable to suppose that 
these are considerably smaller than words. In the first few months much of the ‘data processing’ 
effort will be concerned with building composite elements from these minimal elements and a 
child will not show much sign of ‘knowing’ any words until he starts to assemble these 
composite elements into words. Hence the interval between birth and the advent of first words. 
In the next few months the acquisition of words will be slow because a limited variety of 
building blocks are available and most of the processing is still concerned with creating new 
building blocks. As more become available the rate of acquisition of words increases. 

The initial acceleration in the isolation of words by MK10H can be explained in this kind of 
way. For example, the formation of the element ING means that SING, GOING and THING 
can all be formed relatively easily. In a similar way the formation of GO, itself a word, 
facilitates the formation of GOES and TH enters into THE, WITH, THEM, etc. The program 
differs from the child in that alphabetic characters are probably much closer to being words than 
are the child's minimal elements; indeed, two letters (A and I) are themselves whole words. This 
is why the program’s curves do not start from zero. 

Following the initial acceleration, the rate of acquisition of words by both child and program 
stabilizes and then declines. Two factors seem to be responsible for this: 

1. In the program, word combinations start to be formed and they augment the numbers of 
non-words amongst the elements being formed. One may suppose that in a similar way when 
children start to combine words, an increasing proportion of their processing ‘effort’ will be 
concerned with assimilating linguistic patterns larger than words and this will help to stabilize the 
rate of acquisition of words. 

2. There is some evidence that rare words differ in their structure from common words and, 
given that rare words are recognized later than common words by both program and child (see 
below), this could explain a slowing in the rate of isolation of words. Thus Kinsbourne & Evens 
(1970) have found that frequent words contain letter sequences of higher digram and trigram 
frequency than rare words. Landauer & Streeter (1973) found that significantly more words 
could be formed from common four-letter words by changing one letter than from rare 
four-letter words. In terms of this theory of word acquisition these results mean that common 
words are formed from a smaller variety of constituents than rare words and so their rate of 
formation can be greater. 

In comparing the child’s acquisition of words with the program’s it is important to emphasize 
that there are huge differences in the amounts of ‘data’ available in the two cases and in the 
actual numbers of words isolated. The data available to the program can be literally exhausted of 
all their words in a way which is hardly possible for the corpus on which the child operates 
(Zipf, 1935; Howes, 1964). This difference may explain the relatively steep decline in the rate of 
isolation of words by MK10H compared with children. Given a corpus more similar in size to 
that which children are exposed to it may be that MK10H would show a similar plateau. 


Word sizes 


Figure 2 shows the average length of the words acquired by one child plotted against her age 
(Grant, 1915). There is a clear increase in the length of words during the second year with signs 
of a levelling off at the end. Although the data do not extend beyond 26 months we can deduce 
that the curve must level off at later ages: the rate of increase between 12 and 24 months is 0.125 
letters per month and if this rate did not slacken the average length of words acquired when the 
child was ten would be over 17 letters. 
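
The extrapolation is simple arithmetic, sketched below. The average word length at 24 months is not stated in the text, so the value of 5.0 letters used here is a purely illustrative assumption, as is the function name. At 0.125 letters per month the 96 months between age two and age ten add 12 letters, so any starting length above five letters gives a figure over 17.

```python
def extrapolated_length(start_length, start_month, target_month, rate=0.125):
    """Linear extrapolation of average word length at a constant monthly rate."""
    return start_length + rate * (target_month - start_month)


# Letters added between 24 months and ten years (120 months) at a constant rate.
print(extrapolated_length(0.0, 24, 120))

# Assumed starting length of 5.0 letters at 24 months (illustrative only).
print(extrapolated_length(5.0, 24, 120))
```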

This curve may be compared with the curves in Fig. 3. These show a clear increase in the 
average sizes of words isolated by the program with similar levelling at later stages. It should be 
emphasized that this effect is not simply an artifact of the way the program builds larger 
structures from smaller; after only five scans, words as long as 32 letters could be formed. The 
effect is almost certainly a reflexion of the fact (see Hörmann, 1970, pp. 86-87) that frequency 

Figure 3. The average lengths of words isolated from Texts 1, 2 and 3 by program MK10H at different stages 
of processing. 


and word length are inversely related. The fact that the lengths of the words acquired by the 
child increase in this way is indirect but quite clear evidence that children acquire words in at 
least approximately decreasing order of frequency, in the way that MK10H predicts. The 
deceleration in the rate of increase simply reflects the fact (Zipf, 1935) that the number of 
different rare (longer) words is greater than the number of different common (shorter) words. 


Discussion 


In assessing the strengths and weaknesses of a segmentation model one should consider 
the accuracy with which linguistic segments are isolated. The processes should 
also be plausible psychologically. If the model can also suggest an explanation of 
psychological phenomena then its plausibility is enhanced. 

Previous attempts to segment language by distributional means have been made by linguists 
(Harris, 1955; Gammon, 1969) mainly interested in developing linguistic tools and by 
psychologists (Stolz, 1965; Olivier, 1968) trying to model psychological processes. The most 
successful of such attempts and perhaps the most ingenious is Olivier's which uses a 
shortest-path algorithm borrowed from the field of operational research to find maximum 
likelihood segmentations of the sample text. 

The chi-squared values shown in Table 4 compare favourably with that of 150.6 (1 d.f.) 
obtained by Olivier from political speeches. This suggests that MK10H is generally the more 
accurate method for identifying words. The comparison is not a direct one, however, because 
the texts differ in the two cases as do the methods of calculating chi-squared values. 

As a model of psychological processes MK10H has one striking advantage over other 
methods — on any one scan of the sample it segments the material directly in one left-to-right 
pass without resort to iterative procedures or the like. This is in keeping with the way we seem 
generally to be capable of segmenting speech directly as it is heard. 

MK10H also has the merit, mentioned previously (Wolff, 1975a), of suggesting how frequency 
information and context may influence pattern recognition. These points and others relating to 
pattern recognition are considered in detail in another paper in preparation. 

The program effects an economical recoding of its input data in keeping with the principle of 
economy in cognition and it seems to model certain aspects of children's vocabulary 
development. It may have a bearing on two other psychological phenomena which may be 
mentioned briefly. 

MK10H builds memory structures which are used for the subsequent identification and storage 
of perceptual events. If similar processes are assumed to operate in cognition generally then we 
have a natural means of accounting for the phenomenon of infantile amnesia - our inability to 
remember much from our earliest years. The explanation would be simply that in our first few 
years we have not built the memory structures needed to interpret and store the events which 
impinge on us. 

Another phenomenon which is explicable in terms of this model is the verbal transformation 
effect. Warren & Gregory (1958) report that when a word is presented to subjects repeatedly the 
perception of it may change. A repeating sequence of ‘say’ may be heard after a time as ‘ace’ 
and then perhaps as ‘say’ again. ‘Rest’ may be heard as ‘tress’, ‘stress’ or even ‘Esther’. 
These examples correspond to a shift in word boundaries. The effect of repetition is to artificially 
augment the frequencies of sequences bridging the original word boundaries. This means that a 
process like MK10H which is sensitive to high frequency sequences will tend to pick out these 
alternative sequences as legitimate segments. 
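
The effect of repetition on sequence frequencies can be illustrated with a toy count. The sketch uses letters as a stand-in for speech sounds, which is an assumption made purely for illustration, and `digram_counts` is our own name. When ‘say’ is repeated, the sequence bridging the original word boundary (YS) becomes almost as frequent as the sequences inside the word.

```python
from collections import Counter


def digram_counts(sequence):
    """Count every pair of adjacent symbols in the sequence."""
    return Counter(a + b for a, b in zip(sequence, sequence[1:]))


counts = digram_counts("SAY" * 10)  # ten repetitions with no boundaries marked
print(counts["SA"], counts["AY"], counts["YS"])
```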

One of the assumptions of this work has been that a process for discovering the segmentation 
of language is part of the process by which a child constructs the grammar of his native 
language. This model is seen as a step towards a language acquisition model. The data and 
arguments presented to date are intended to show that the kinds of process put forward are 
sufficiently plausible to merit further development into a language acquisition model. 

The other side of Johnson-Laird’s interpretation of the passive voice 


J. Costermans and M. Hupet 


Two studies by Johnson-Laird (1968 a, b) were guided by the notion that the function of the passive is to 
emphasize the importance of the logical object (LO) by placing it in the grammatical subject position. 
However, the a priori meaning assigned to the subjects’ responses implied that ‘importance’ attaches to the 
entity which is represented by the larger area in a coloured strip. It will be shown that this is merely one of 
two possible interpretations, for it may just as easily be argued that importance attaches to the narrower area 
which might be regarded as representing the focal point of the new information imparted by the sentence. In 
the present study, subjects were required to refer sentences which involved a clear distinction between given 
and new information to different types of coloured figures. The pattern of responses clearly shows that subjects 
were more likely to refer a particular sentence to the figure in which the colour stressed in the assertional 
focus of the sentence is physically the less important. Present data as well as Johnson-Laird's are thus 
interpreted as supporting the recent view according to which the logical subject (LS) is really that which is 
emphasized in passives as the assertional focus while the LO is presuppositional. 


Considering that an essential distinction between the active and passive voices may rest in their 
relative usage, psychological attention has focused upon the function of passives in language 
use. Since the main structural difference between the two voices is in the position of the logical 
subject (LS) and logical object (LO), studies have investigated the communicative functions of 
this word-order difference, concentrating mainly on the importance of the grammatical subject 
(GS) at the beginning of a sentence. Several experiments (Wright, 1969; Tannenbaum & 
Williams, 1968) have validated a contextual contiguity function by demonstrating that there is a 
tendency for congruence between the GS and the preceding context, thus suggesting that the GS 
is selected to provide a discourse continuity. In contrast, two studies by Johnson-Laird (1968 a, b) 
have attempted to assess directly the emphatic function of the GS. These studies were guided 
by the notion that the function of the passive is to make the LO more prominent by placing it in 
the position of the GS at the beginning of the sentence. The sentences used as experimental 
material had the form blue follows (or precedes) red and red is followed (or preceded) by blue. 
The subject's task was basically to find or express correspondences between these sentences and 
coloured strips, with it being assumed that the size of areas representing GS and GO provides 
‘an index of importance’. The following predictions were confirmed: (1) The GS would be 
assigned a larger area in the strip than the GO, and (2) the difference between size of GS and 
GO would be larger for the passive than for the active. These findings were interpreted as 
supporting the idea that the function of the passive is to emphasize the importance of the LO by 
making it the GS, the passive implying thus that ‘the LO is more important than the LS’ (1968 b, 
p. 7). 

Such a claim, however, seems to be contradicted by more recent studies which have 
investigated the functions suggested by the GO position. As mentioned above, previous research 
concentrated on the importance of the GS when occupying first position in the sentence without 
paying much attention to the GO. It is now indisputable, however, that the GO of passives is 
also of importance, for it has been clearly determined (Klenbort & Anisfeld, 1974) that it was 
interpreted by the subjects as the focal point of the new information imparted by the sentence 
and as the carrier of overall responsibility for the sentential proposition, while the LO at the 
beginning of passive sentences was interpreted as presuppositional and thus regarded as known 
and established. 

Since both LO (Johnson-Laird, 1968b) and LS (Klenbort & Anisfeld, 1974) of passives have 
been found dominant, a paradox arises in the interpretation of the passive voice. It has been 
proposed, however, that the GS and GO are important in different senses (Klenbort & Anisfeld, 
1974); the former as the theme or topic of the sentence, and the latter as the focus of the 
sentential assertion. The question now is whether Johnson-Laird’s data could be reinterpreted in 
accordance with this proposal. One possible method of achieving this may arise from 
examination of Johnson-Laird’s experimental procedure, and particularly of the a priori meaning 
he assigned to the subject’s responses. Johnson-Laird’s interpretation implies that ‘importance’ 
attaches to the entity which is represented by the larger area in the coloured strip. This is merely 
one of two possible points of view. Indeed, it may just as easily be argued that ‘importance’ 
attaches to the narrower area, even more so if one considers that in normal language use the 
speaker stresses the aspects of his environment which are less evident (Chafe, 1974). In this 
case, the larger area would represent the given material which the speaker assumes is obviously 
known to the listener, and the narrow area would represent the new information which the 
speaker assumes is not. 

The three short experiments to be reported were designed to test this alternative interpretation 
of the subjects’ responses. No passive sentences were used, however, because sentence 
structures allowing both the contextual referent (or given information) and the assertional focus 
(or new information) to be clearly identified were needed. The French sentences used as 
experimental material were plotted against coloured areas in order to establish to which terms 
(referential or assertional) the larger and narrower areas would be mapped. 


Experiment I 
Methods 


In this first experiment, the sentences used were as follows: 

(1a) Après le rouge, il y a du bleu 

(1b) Avant le rouge, il y a du bleu 

(1c) Après le bleu, il y a du rouge 

(1d) Avant le bleu, il y a du rouge 
These may be translated as: (1a) after red, there is blue; (1b) before red, there is blue; (1c) after blue, there 
is red; and (1d) before blue, there is red. According to the structure of these sentences, only one of the 
colour terms (blue in 1a and 1b, red in 1c and 1d) seems to be part of the focal assertion, the other one 
being relegated to a prepositional phrase indicating a space reference. Moreover, the opposition of the 
articles ‘le’ (the) and ‘du’ (a), respectively definite and indefinite, stresses the referential or presuppositional 
nature of the prepositional phrase in contrast with the informational nature of the base proposition (Hupet & 
Le Bouedec, 1975). In this respect, a better translation might be: After the red area, there is a blue one, etc. 

Two coloured strips similar to those of Johnson-Laird were used. They measured 10 × 2 cm. One was 2/3 
red and 1/3 blue, and the other just the converse. 

Sixty native French-speaking subjects served as volunteers in this experiment. They were given one of the 
sentences in written form, together with the two appropriate figures in such a manner that the only difference 
between these two figures was in the colour proportions. They were asked to indicate which figure was best 
described by the sentence. 


Table 1. Number of choices of either figure (red larger or smaller than blue) for the four 
sentences 


                                        R > B   B > R 
(1a) Après le rouge, il y a du bleu       24       6 
(1b) Avant le rouge, il y a du bleu       26       4 
(1c) Après le bleu, il y a du rouge        8      22 
(1d) Avant le bleu, il y a du rouge        2      28 


Results 


The number of choices of either figure for the four sentences is indicated in Table 1 (R > B refers 
to the figure where the red area is larger, and B > R to the figure where the blue area is larger). 

In accordance with our hypothesis, more than 80 per cent of the subjects referred the 
sentence to the figure in which the colour stressed in the assertional focus is represented by the 
narrower area. 
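
The percentage quoted above can be recomputed from Table 1. The following is a minimal sketch, not part of the original analysis: the pooling of the four sentences and the one-sided exact binomial test against chance (p = 0.5) are assumptions made for illustration, and the names `TABLE_1`, `consistent_choices` and `binomial_upper_tail` are hypothetical. Note also that the table rows sum to 120 choices although 60 subjects are reported, so the choices may not all be independent.

```python
from math import comb

# Counts copied from Table 1. "focus" is the colour named in the asserted
# clause (blue in 1a/1b, red in 1c/1d); the two counts are the numbers of
# subjects choosing the red-larger (R>B) and blue-larger (B>R) figures.
TABLE_1 = {
    "1a": {"focus": "blue", "R>B": 24, "B>R": 6},
    "1b": {"focus": "blue", "R>B": 26, "B>R": 4},
    "1c": {"focus": "red", "R>B": 8, "B>R": 22},
    "1d": {"focus": "red", "R>B": 2, "B>R": 28},
}


def consistent_choices(row):
    """Choices of the figure in which the focal colour has the narrower area."""
    return row["R>B"] if row["focus"] == "blue" else row["B>R"]


def binomial_upper_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p); an illustrative chance baseline."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


consistent = sum(consistent_choices(row) for row in TABLE_1.values())
total = sum(row["R>B"] + row["B>R"] for row in TABLE_1.values())
proportion = consistent / total
p_value = binomial_upper_tail(consistent, total)

print(f"{consistent}/{total} = {proportion:.1%}")  # 100/120 = 83.3%
```

The pooled proportion of 100 out of 120 choices agrees with the ‘more than 80 per cent’ reported in the text.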


Experiment II 


To ascertain that there cannot be any doubt about the location of the referential and assertional 
parts of the sentences, the experiment was replicated with slightly different material which 
presumably may be considered yet more appropriate, although deviating even more from that of 
Johnson-Laird. 


Method 


Two matrices of 12 × 12 red and green dots were constructed. One contained 132 red dots and 12 green dots 
randomly distributed, and the other 12 red and 132 green. The following sentences were used: 

(2a) Outre le rouge, il y a du vert 

(2b) Outre le vert, il y a du rouge 
These can be translated as: (2a) as well as red, there is green; and (2b) as well as green, there is red. Or 
better, taking the articles into account: As well as the red dots, there are green dots. 

A different sample of 60 subjects served in this experiment. Thirty subjects were presented with sentence 
(2a) and 30 others with sentence (2b). Every subject saw both figures. In each group, 15 subjects saw the 
figure with a majority of red dots at the left, and 15 had the reverse presentation. Again the sentence had to 
be referred to one of the two figures. 


Table 2. Number of choices of either figure (majority of red or green dots) for the two sentences 


                                        R > G   G > R 
(2a) Outre le rouge, il y a du vert       28       2 
(2b) Outre le vert, il y a du rouge        4      26 


Results 


The results are indicated in Table 2 (R > G refers to the figure with a majority of red dots, 
G > R to the figure with a majority of green dots). 

The conclusion is in the same direction as previously, since 90 per cent of the subjects 
referred the sentence to the figure in which the colour stressed in the assertional focus is the less 
frequent. 


Experiment III 

According to comments of some subjects in Expts. I and II, the use of different articles may be 
a source of some trouble in the interpretation of the data. Indeed, the French article ‘du’ is 
somewhat equivocal and might be understood in a partitive sense. Thus, a sentence such as (2a) 
Outre le rouge, il y a du vert could be taken as meaning As well as red dots, there are some green 
ones. If such was the case, such a sentence would be associated with the figure where the green 
dots are less frequent, but not for the reasons we proposed. Therefore, the experiment was once 
again replicated using sentences devised to avoid the use of the article ‘du’, although the use of 
two definite articles in such a sentence framework does not follow the most normal French 
format. The following sentences were used: 

(3a) Outre les points rouges, il y a les points verts 

(3b) Outre les points verts, il y a les points rouges 
The figures and the experimental design were as in Expt. II. The task was presented to a further 
50 subjects. The observations gained from this replication are shown in Table 3. 


Table 3. Number of choices of either figure (majority of red or green dots) for the two sentences 


                                                          R > G   G > R 
(3a) Outre les points rouges, il y a les points verts       22       3 
(3b) Outre les points verts, il y a les points rouges        6      19 
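
No summary percentage is stated for this replication, but one can be derived from the Table 3 counts in the same way as for Expts. I and II. The sketch below is illustrative only; the counts are keyed by sentence wording rather than by label, and the names `TABLE_3` and `minority_focus_choices` are hypothetical.

```python
# Counts copied from Table 3, keyed by the sentence itself.
# Each entry: (sentence, focal colour, choices of R>G figure, choices of G>R figure).
TABLE_3 = [
    ("Outre les points rouges, il y a les points verts", "green", 22, 3),
    ("Outre les points verts, il y a les points rouges", "red", 6, 19),
]


def minority_focus_choices(focus, red_majority, green_majority):
    """Choices of the figure in which the focal colour is the less frequent."""
    # If green is asserted, green is the minority in the red-majority figure.
    return red_majority if focus == "green" else green_majority


rows = []
for sentence, focus, red_majority, green_majority in TABLE_3:
    hits = minority_focus_choices(focus, red_majority, green_majority)
    rows.append((sentence, hits, red_majority + green_majority))

pooled_hits = sum(hits for _, hits, _ in rows)
pooled_n = sum(n for _, _, n in rows)

for sentence, hits, n in rows:
    print(f"{sentence}: {hits}/{n} = {hits / n:.0%}")
print(f"Pooled: {pooled_hits}/{pooled_n} = {pooled_hits / pooled_n:.0%}")
```

On these counts, 41 of the 50 subjects (82 per cent) matched the sentence to the figure in which the colour named in the asserted clause is the less frequent.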


Discussion 


The matching procedure used in the present experiments has often been used in previous studies 
to highlight the effects of perceptual input on language processing. It was the aim of the present 
study to show how the interpretation of these effects can be misled by the nature of the 
questions one asks. Johnson-Laird's interpretation of the passive voice attempts to answer the 
question of whether the LO of a passive should be considered the important entity of the 
sentence. This implies that there must be an entity which is more important than another, the 
notion of ‘importance’ remaining, however, fairly mysterious and very much dependent upon 
the experimenter's own conception of the term. As concerns Johnson-Laird's interpretation, it is 
clear that it rests on the assumption that what is physically predominant in the real world 
context has to be predominant in the particular surface structure form realized in natural 
language use. On this assumption, it was predicted that the entity representing the larger area of 
an asymmetrical stimulus would be placed at the beginning of a normal or inverted passive 
sentence, since it was hypothesized that the passive is used to indicate that importance is 
attached to that entity which occupies initial nominal position in the sentence. 

However, corroborating recent studies on the communicative function of the passive voice 
(Hornby, 1974; Klenbort & Anisfeld, 1974), results of the present experiments suggest on the 
contrary that a sentence is more likely to stress what the speaker thinks is less evident to his 
audience. This, most of the time, results in focusing elements which are physically less evident 
or unexpected. In a complementary short experiment, 30 subjects were presented with both 
figures used in Expts. II and III, and asked to indicate for which of these two figures the 
statement There are green dots was the most suitable; 28 subjects chose the figure with a 
minority of green dots. Therefore, it is at least questionable to determine which is the most 
important entity of a sentence only by examining what is physically more important in the figure 
to which the sentence has been associated. As is clearly shown in the present study, what is 
emphasized in a sentence depends less on the physical features of a figure as such than on the 
interpretation that a speaker makes of these features with the goal of communicating an 
integrated semantic description. For much the same reasons that some sentential structures are 
to be interpreted as involving an assertional and presuppositional component, it can be 
considered that some figures, of the kind used both by us and by Johnson-Laird, are to be 
interpreted as involving some elements (e.g. the larger area, the majority dots) which are evident 
enough to be regarded as presupposed, and others (e.g. the smaller area, the minority dots) 
much less evident or unexpected which justify the attention the speaker will focus on them, 
making them the focus of the assertion. Therefore, if the present findings do not question 
Johnson-Laird’s data, they contradict the way he interpreted them since it is here shown that the 
logical subject, not object, is really that which is emphasized. 


The passive paradox: A reply to Costermans and Hupet 


P. N. Johnson-Laird 


As a result of experiments on the choice and interpretation of sentences in the passive voice 
(Johnson-Laird, 1967, 1968a, b), I have argued that one reason a speaker chooses the passive 
voice is in order to emphasize the relative importance of the entity referred to by its grammatical 
subject. In other words, speakers put first what they wish to emphasize. Costermans & Hupet 
(1977) accept my findings, but offer some cleverly contrived results of their own that they claim 
contradict my original hypothesis. 

The root of the disagreement is simply the fact that the passive voice has a variety of uses. 
One of them is, as Jespersen (1924) and others before him have pointed out, to allow the person 
or thing that is of the greater interest to be made the grammatical subject of the sentence, e.g. 
‘His son was run over by a motor car’ (Jespersen's example). In formulating and testing this 
hypothesis, initially in ignorance of having been anticipated by Jespersen, I assumed that other 
things being equal a reasonable index of linguistic importance would be size. If the only way 
subjects could distinguish between ‘red follows blue’ and ‘blue is followed by red’ was by 
colouring a rectangular strip, then it was reasonable to suppose that they would represent the 
more important referent with a larger area of colour. Certainly, the experiments indicated that 
individuals took the grammatical subject of the passive to refer to a larger area than the 
grammatical object. The notion of importance is, of course, rather vague and in my thesis 
(Johnson-Laird, 1967), though not in the published papers, I introduced a technical term, 
‘dominance’, for which I proposed an independent operational definition. Such a procedure is 
just as unsatisfactory as relying on the intuition that size correlates with importance. At 
least, I used to think so until I read Costermans & Hupet's claim that importance attaches to the 
smaller area of the coloured strip. Of course I do not wish to argue that a small area of a 
stimulus cannot be important: in real life, it may be crucial. But it is hard to believe it will be so 
in experiments where size is the only variable that subjects can manipulate. 

I still wish to maintain that one use of the passive is to allow the importance of the 
grammatical subject to be emphasized. There are three phenomena that seem to be decisively in 
favour of this point of view and against Costermans & Hupet's claim that it is the grammatical 
object that invariably refers to the important entity. 

First, one sort of passive is the so-called agentless form that has no grammatical object, e.g. 
‘The hostages were released yesterday’. Whatever the reason the grammatical object is omitted, 
the passive can hardly have been chosen in order to emphasize it. Such passives, in fact, are the 
most frequent sort to be found in both spoken and written texts (see Jespersen, 1924; Svartvik, 
1966). 

Second, importance ought to be reflected in the relative scope of quantifiers: the most 
important quantifier is the one with the largest scope. A pair of sentences such as: 


All philosophers have read some books 
Some books have been read by all philosophers 


are, when isolated from context, ambiguous in the same way (Chomsky, 1965). There is, 
however, a natural tendency to interpret the first sentence to mean that all philosophers have 
read some books or other, and to interpret the second sentence to mean that there are some 
particular books that all philosophers have read (see Johnson-Laird, 1969, for relevant 
experimental results). These interpretations accord with the principle of treating the initial 
quantifier as containing the second quantifier within its scope. 
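
The two preferred readings can be written out in predicate logic to make the scope difference explicit. This is only an illustrative sketch: the predicate names are hypothetical, and ‘some books’ is simplified to a single existential quantifier.

```latex
% Preferred reading of the active: the universal quantifier has wide scope
% (each philosopher has read some book or other).
\forall x\,\bigl(\mathrm{Philosopher}(x) \rightarrow
    \exists y\,(\mathrm{Book}(y) \wedge \mathrm{Read}(x, y))\bigr)

% Preferred reading of the passive: the existential quantifier has wide scope
% (there is a particular book that every philosopher has read).
\exists y\,\bigl(\mathrm{Book}(y) \wedge
    \forall x\,(\mathrm{Philosopher}(x) \rightarrow \mathrm{Read}(x, y))\bigr)
```

The second formula entails the first but not conversely, which is why the choice of initial quantifier matters for interpretation.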
Third, in an unpublished experiment (Johnson-Laird, 1967), subjects were asked to choose 
whichever member of active and passive pairs of sentences they thought was more likely to 
occur in everyday language. It was evident that they preferred sentences with grammatical 
subjects that were definite (as opposed to indefinite), that consisted of a proper name (as 
opposed to a common noun), and that referred to a human being (as opposed to an inanimate 
entity). The subjects would choose a passive rather than an active in order to maintain these 
preferences. Once again, the preferred grammatical subject seems intuitively to denote the more 
important entity. Similar results for definiteness have been reported by other investigators (e.g. 
Grieve & Wales, 1973). 

Do these three phenomena constitute decisive evidence that a passive can be used to 
emphasize the importance of its subject’s matter? Perhaps not. Once grant Costermans & 
Hupet's assumption that the smaller of two areas of colour is thereby the more important, and 
one seems to enter into a topsy-turvy world. As the area gets progressively smaller, it 
presumably grows in importance, and so the most important areas of all are the invisible ones. 
Indeed, one may just as well argue, first, that the most important referent in a situation is the 
one not explicitly referred to in the agentless passive (it is not referred to because it is 
invisible), second, that the smaller the scope of a quantifier, the more important it is, and, 
third, that important entities tend to be inanimate objects rather than human beings and tend to 
be referred to by indefinite noun phrases rather than by name. In short, one may as well argue 
that the reason why a speaker chooses to say: ‘His son was run over by a motor car’ is in 
order to emphasize the importance of ‘a motor car’. A disinterested reader may be wondering 
why anyone would wish to maintain such a view. Its underlying motivation arises from another 
use of the passive voice. 

As Jespersen points out, the passive voice may facilitate the connection of one sentence with 
another. In particular, if a given topic is under discussion, it is natural to refer to it in the 
grammatical subject of the sentence and to refer to a new referent in the grammatical object. In 
order to follow this natural order, it may be necessary to couch the sentence in the passive 
voice, e.g. ‘What happened to John? He was jilted by a girl’. There is a variety of evidence in 
support of this hypothesis (e.g. Carroll, 1958; Tannenbaum & Williams, 1968; Klenbort & 
Anisfeld, 1974). 

Costermans & Hupet have performed a useful service in drawing attention to the paradoxical 
contrast between these results and mine, but their resolution of the paradox is implausible. They 
proceed on the basis of the assumption that what a sentence asserts must be more important 
than what it presupposes. They test this assumption by requiring their subjects to match 
sentences to stimuli, and they find that the following sorts of sentences: 


Après le rouge, il y a du bleu 

(After the red, there is some blue) 

Outre le rouge, il y a du vert 

(As well as the red, there is some green) 

Outre les points rouges, il y a les points verts 

(As well as the red dots, there are the green dots). 


are mainly matched to a strip with a smaller amount of blue (or green) than red. The subjects 
take the noun in the asserted clause to refer to a smaller strip of colour than does the noun in 
the presupposed clause. Since Costermans & Hupet take for granted that what is asserted in a 
sentence must be more important than what is presupposed, they feel justified in inferring that 
smallness correlates with importance. Hence, they conclude that my experiments have in fact 
shown that it is the grammatical object of a passive which is the important entity. 

But the paradox can be resolved in another way. What if in some circumstances presupposed 
information is more important than asserted information? One could then argue that such 
circumstances include Costermans & Hupet’s experiments, and consequently that the more 
important entities were matched to the larger areas of the stimuli. The crux is, indeed, the 
assumption about the relative importance of assertions and presuppositions. In discourse where 
speakers always ensure that the presuppositions of their utterances have been previously 
established, it is plausible to suppose that what is new, i.e. what is asserted, may well be of 
greater importance. But discourse is not always arranged in such an homogenized manner. 
Certain rhetorical advantages and short-cuts can be achieved by taking for granted something 
that has in fact yet to be established. (The persuasive effects of such a technique have been 
explored by Hornby, 1974.) Thus, consider the following example (see Karttunen, 1974): 


We regret that no credit is given 


This statement presupposes the critical fact that no credit is given and asserts merely a polite 
regret. Perhaps one way to ensure that presuppositions are treated as more important than 
assertions is to present sentences out of context. The presuppositions will not then have been 
established prior to presentation, and the listener (or reader) instead of merely checking the 
‘given’ information against what he already knows will have to try to establish it ab initio. The 
presuppositions will stand out by virtue of their lack of prior satisfaction. The technique of 
presenting sentences in this way is precisely the one used by Costermans & Hupet. Hence, it is 
legitimate to suppose that with their material the presupposed clauses were taken to be more 
important than the asserted clauses, especially as they were also invariably first in the sentence. 

It is now easy to explain the paradox of the passives. On the one hand, when a passive occurs 
in context and its grammatical subject denotes a given referent, then it is likely that its 
grammatical object will refer to new, and thereby more important, information. On the other 
hand, when a passive lacks a context, then it is likely that its grammatical subject will refer to 
information that is not given and which will thereby be taken to be important. Hence, 
experiments such as mine in which passives occur out of context may well establish the 
importance of the grammatical subject, whereas experiments in which passives occur in 
appropriate contexts may well establish the importance of the grammatical object. Paradox 
resolved. 

There is one final corollary. The preference for passives with a definite rather than an 
indefinite noun phrase as grammatical subject could reflect the importance of the entity referred 
to or the fact that it is the topic of discourse (or both). It has sometimes been suggested that 
topicality is all, and that a speaker chooses a definite description only because the referent has 
already been established as a topic (Hupet & Le Bouedec, 1975). Once again, this assumption 
turns out to be false. As Karttunen (1974) points out, a sentence such as: 


John lives in the third brick house down the street from the post office 


may be used to give directions to someone who hitherto had no idea that there was such a 
house. Just as a passive can be used without its grammatical subject denoting given information 
so too a definite description can be used without its referent having already been identified. 


Phrase units as determinants of visual processing in music reading 


John A. Sloboda 


Keyboard musicians sight-read passages of music in which the amount of information about the presence of 
phrase units was systematically varied. A distinction was made between ‘physical’ unit markers, which 
allowed delineation of a unit prior to analysis of its component elements, and ‘structural’ unit markers, 
which defined a unit in terms of the sequential rules obeyed by its constituent elements. During the 
execution of the passages the text would be removed at a point known in advance only to the experimenter. 
Subjects were then required to execute all the material seen beyond this point to provide a measure of the 
eye-hand span at that point. It was found that the presence of structural markers increased span, and tended 
to cause span to extend exactly to a phrase boundary. The presence of physical markers did not increase 
span, although it also tended to cause span to extend exactly to a phrase boundary. The results suggest a 
clear analogy between the cognition of music and language, in that knowledge of abstract structure is of 
importance in the organization of immediate visual processing of text. 


At every level of written language groups of letters seem to be important for the reading 
process, whether the group be as small as a syllable (Spoehr & Smith, 1973), or as large as a 
phrase (Levin & Kaplan, 1970). Some groupings, such as the word, are directly visible. Each 
word is marked off from its neighbour by a space, so that to locate a word requires nothing more 
than to locate two adjacent spaces. Other groupings, such as the syllable, are not visible in this 
immediate sense. There is no particular visual event which marks the boundary between one 
syllable and the next. Syllables are defined by rules which determine the internal organization of 
their constituent elements (cf. Hansen & Rogers, 1965), so that to locate a syllable requires 
analysis of the relations between these constituent elements. This difference may be summarized 
by saying that some linguistic units are defined by ‘physical markers’ like spaces or punctuation 
marks, whilst other units are defined by ‘structural markers’, rules or constraints which relate 
elements of the text into organized groups. 

Physical and structural markers are not exclusive alternatives. A given unit may be marked in 
both fashions, as, for instance, in the word, where lexical, grammatical, and semantic 
constraints can clearly allow the reader to tell one word from the next even in the absence of 
interword spaces. In this context it is significant to consider the reception of spoken language, 
for physical markers would appear to be less frequent than in written language. Words, for 
instance, run into one another without any characteristic acoustic event to tell the listener that a 
new word has begun. In fact, there is evidence to suggest that listeners reinterpret acoustic 
events in the light of structural factors. Fodor & Bever (1965) have shown that a click heard 
near a clause boundary tends to be perceived as having occurred at that boundary. Morton 
(1974) has demonstrated that detection of a target phoneme is improved if the word including the 
target is rendered highly probable by the preceding context; and the research of Warren (cf. 
Warren & Warren, 1970) shows that perception of a phoneme depends as much on the context 
of utterance as on the acoustical properties of the message. 

Many theorists hold that we acquire reading skills which require the minimum change in the 
procedures already in existence for dealing with spoken language (cf. Kavanagh & Mattingley, 
1972). A possible consequence of this would be that readers are primed to utilize structural, 
rather than physical, markers in their analysis of written text. Weber (1970) has found that errors 
made by beginning readers are often related to the correct version by structural factors only 
(such as grammatical or semantic appropriateness). Marcel (1974) has demonstrated that fast 
readers make use of structural factors to enable a greater amount of graphical material to be 
covered in individual fixations, and Hochberg (1970) has suggested that physical markers, such as 
interword spaces, function primarily as cues for a ‘peripheral search guidance’ mechanism 
which has a ballistic function in relation to eye fixations. Once a segment of text is in central 
vision, physical markers may be irrelevant for ‘cognitive’ analysis of the text. 

It has been argued elsewhere that music and language have fundamental underlying 
similarities, both formally and psychologically, which makes it fruitful to undertake comparative 
studies of the two skills (e.g. Winograd, 1968; Gates, Bradshaw & Nettleton, 1974; Sloboda, 
1974b). Some evidence is now available that groupings in musical notation are as important for 
the music reading process as they are for language. A unit of particular importance appears to be 
the phrase (Sloboda, 1974a). The evidence is based on a technique devised by Levin and his 
associates (Levin & Kaplan, 1970) for the study of eye-voice span (EVS) in language reading. 
This technique involves displaying portions of text for the subject to read aloud (or in the case 
of music, to execute on an instrument). At a point known in advance only to the experimenter 
the source of illumination for the text is switched off. The subject is then required to 
report/execute all the material seen beyond the point reached when the light went out. The 
number of items correctly reported in this fashion gives an estimate of the EVS. It is well 
known that EVS is dependent both on the experience of the reader and on the nature of the text 
being read (e.g. Tinker, 1958; Morton, 1964). Levin and his associates were particularly 
interested to discover the effect of placing a phrase boundary just beyond the point where the 
illumination was switched off. They found that EVS had a tendency to extend exactly to this 
boundary even when the boundary fell outside the average span of the subject. It appears that 
‘readers have an elastic span, which stretches or shrinks to phrase boundaries’. The music study 
(Sloboda, 1974 a) showed comparable results for subjects required to sight-read melodies. First, 
more accurate readers had longer spans. There was a positive rank correlation of 0-89 between 
span and accuracy of performance during the whole test. This suggested that eye-hand span (EHS) is an 
important component of sight-reading competence. Secondly, there was a significant tendency 
for the span to extend to a phrase boundary. Further analysis of these results (Sloboda, 1974 b) 
showed this tendency to be most marked among the more accurate readers. The implication, 
therefore, is that attention to phrase units is connected to the larger EHS of accurate 
sight-readers, and is, at least in part, responsible for their superior sight-reading performance. 
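
The span measure described above can be made concrete with a short sketch. It is illustrative only: the text says merely that the number of items correctly reported beyond the light-out point estimates the span, so the scoring rule used here (counting correct notes in order and stopping at the first error) is an assumption, as are the names `eye_hand_span` and `reaches_boundary` and the representation of notes as pitch strings.

```python
def eye_hand_span(score, performed, lights_out_index):
    """Count notes played correctly, in order, beyond the light-out point.

    score            -- the full notated sequence (list of note names)
    performed        -- the notes the subject played after the text vanished
    lights_out_index -- index in `score` of the first note not yet played
                        when the text was removed
    """
    span = 0
    for expected, played in zip(score[lights_out_index:], performed):
        if expected != played:
            break  # assumed rule: the span ends at the first error
        span += 1
    return span


def reaches_boundary(span, lights_out_index, boundary_index):
    """True if the span extends exactly to a phrase boundary.

    boundary_index is the index of the first note of the next phrase.
    """
    return lights_out_index + span == boundary_index


# Hypothetical trial: the text is removed before the fourth note, and the
# current phrase ends after the sixth note (so the next phrase starts at 6).
score = ["C4", "D4", "E4", "F4", "G4", "A4", "B4", "C5"]
performed = ["F4", "G4", "A4", "C5"]

span = eye_hand_span(score, performed, lights_out_index=3)
on_boundary = reaches_boundary(span, lights_out_index=3, boundary_index=6)
print(span, on_boundary)  # 3 True
```

In this made-up trial the subject continues correctly for three notes and stops exactly at the phrase boundary, which is the pattern of ‘elastic’ span reported for accurate readers.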

It was not clear from the data obtained what features of the musical text musicians used in 
identifying phrases. There are a number of possible phrase markers available in musical text, 
some physical and some structural. Music has historical roots in song, and the musical phrase 
has a clear connection with a line of spoken verse, such that the breath between two lines 
corresponds to a natural closure in the music. Rhythm (the relative duration of notes) is a salient 
feature of a melody, and the necessary period of silence between one phrase and the next 
(where a singer would take a breath) becomes an integral part of the rhythmic structure of the 
music. One phrase is thus often ‘spaced’ from surrounding phrases by the longer time interval 
that occurs at the phrase boundary. This spacing is exactly mirrored in musical notation, where 
the space to the right of a given note is proportional to its duration, so that long notes at the end 
of phrases are followed by relatively large spaces. This spacing of musical phrases is not a 
feature of every piece of music, but it occurs so often in all but the most advanced 
contemporary styles that it is plausible to propose that readers may come to notice spaces as 
highly probable phrase markers. It is, however, possible to devise quite conventional melodies in 
which the ‘breathing time’ between phrases is reduced to a minimum, and to reduce the 
corresponding visual salience of a phrase quite considerably. In such melodies there still remain 
features which define the phrase unit. These are structural markers and they arise primarily out 
of rules for harmonic progression which serve as ‘grammatical’ rules for music (Winograd, 
1968). Within the mainstream of traditional western music the phrase is a basic unit of 
harmonic structure since each phrase unit must end with one of only four types of harmonic 
progression known as ‘cadences’. For a musician the awareness that a musical sequence is 
progressing towards a cadential point is clear knowledge that the end of a phrase is imminent. 
The cadential sequence acts as a structural marker of the phrase unit. In contrast, the 
interphrase space acts as a physical marker of the phrase unit. So little is known about the music 
reading process that it is not easy to make informed predictions about the kind of phrase marker 
which will be most effective for musicians. One cannot, as in language, argue from facts about 
auditory reception, since learning to read music may often occur concurrently with, or even 
precede, any significant awareness of the variety of musical sound. Even if one were to 
construct such arguments they would require knowledge about the way in which untutored 
listeners analyse musical input. Such knowledge is at present in its infancy. One possible 
argument in favour of the importance of structural, rather than physical, markers is that musical 
typography is not as consistent as in printed language. That is to say, different typesetters 
employ different spacing conventions with regard to the distances between notes, and therefore 
the legibility of physical markers will tend not to be consistent. On the other hand music printed 
for learners tends to be particularly consistent in matters of spacing, on the probably justified 
assumption that this helps the learner. For instance, many British students take much of their 
musical material from Associated Board publications, which are scrupulously consistent in 
matters of typography. 

It is agreed by musicians that good sight reading is a remarkable achievement, and the 
achievement lies in being able to read at anything approaching the accepted performing speed of 
a difficult piece. Any musician can read perfectly if allowed to go slowly enough. The finding 
that accurate music readers are sensitive to groupings in the text has a familiar ring to those who 
have been concerned with the problem of speed in language reading, for two major theoretical 
approaches based on such a finding have merged to help explain how the speed problem may be 
overcome. One we may call the ‘unitization’ hypothesis (specifically related to word 
identification by, for instance, Reicher, 1969). This supposes that a group of units may be 
analysed ‘as one’ (which could mean simultaneously). The hypothesis supposes that groups can 
be delimited before any major analysis commences, and thus it would seem necessary that 
segmentation occurs by detection of physical unit markers. Spoehr & Smith (1973) show how the 
unitization hypothesis, when applied to units like the syllable, which are not physically marked, 
must lead to a conceptual paradox. The second approach may be called the ‘redundancy’ 
hypothesis. This exists in many forms (Morton, 1969, for instance, presents a variant of this as a 
model for word recognition), and supposes that probable items require less perceptual analysis to 
be identified. Any kind of structure will create redundancy, since the effect of structure is to 
exclude large numbers of possible combinations of elements. The hypothesis requires only that 
the reader is sensitive to the structural elements in the text that will provide him with the 
relevant redundancy cues. 

Sloboda (1974 a) demonstrated how both these hypotheses could account for the tendency of 
visual span to extend to a phrase boundary. Since the unitization hypothesis requires attention to 
physical unit markers, and the redundancy hypothesis requires attention to structural unit 
markers, it becomes possible to test these hypotheses by presenting readers with passages 
lacking either one or the other type of phrase marker. If the phrase boundary effect occurs when 
only physical markers are present, then the unitization hypothesis is indicated. If the effect 
occurs when only structural markers are present then the redundancy hypothesis is indicated. The 
following experiment was designed to test these hypotheses. It should be noted that unitization 
and redundancy are not mutually exclusive mechanisms. Both types of process could operate on 
the same material. If this were the case then phrase boundary effects would be predicted when 
either structural or physical phrase markers were present. However, in such a case, a phrase 
boundary should be even more effective when bounded by both types of marker. Furthermore, if 
structural and physical markers were truly additive in their effect on behaviour then we should 
expect a lack of statistical interaction between their effects. The experimental design was such 
that an interaction of this sort could be tested for. 


Method 
Materials 


Thirty-six eight-phrase sequences were constructed comprising nine matched sets of four sequences each. A 
matched set contained the following: 

(a) A simple diatonic melody in the style of a folk or hymn melody where each phrase boundary was 
marked by the presence of a cadence and a relatively long interphrase space (physical and structural 
markers). 

(b) A sequence identical to the first in rhythm, and thus identical in spacing, but with the pitches assigned 
to values which made harmonic nonsense of the sequence (physical markers only). 

(c) A sequence with comparable harmonic structure to (a) but without interphrase spaces (structural 
markers only). 

(d) A sequence identical to (c) in rhythm but with the pitches assigned to values which made harmonic 
nonsense of the sequence (no phrase markers). 

The nine sets comprised three with phrase length of five notes, three with phrase length of seven notes, 
and three with phrase length of nine notes. A critical phrase boundary was chosen for each set. This 
boundary was never earlier than the boundary between third and fourth phrases, and was never less than 
seven notes from the end of a line of notation. During the experiment, the point at which the sequence was 
made unavailable could be either 5, 7, or 9 notes prior to the critical phrase boundary. This distance was 
constant for a given subject on a given matched set, but systematically varied in all other respects to give 
equal numbers of readings in each cell of the design. 
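
The within-subject factors described above can be enumerated as a minimal sketch. The names `design_cells` and `sequence_type` are hypothetical and not part of the original study; the mapping of marker presence onto sequence types (a) to (d) follows the description of a matched set. That each of the 36 sequences read by a subject falls in a different cell is an inference from the statement that readings were balanced across cells, not something the text states directly.

```python
from itertools import product

# Factor levels as described in the Materials section.
PHRASE_LENGTHS = (5, 7, 9)      # notes per phrase
CUTOFF_DISTANCES = (5, 7, 9)    # notes from cut-off point to critical boundary
PHYSICAL = (True, False)        # interphrase spaces present or absent
STRUCTURAL = (True, False)      # cadential harmonic structure present or absent


def sequence_type(physical, structural):
    """Return the matched-set label (a)-(d) for a combination of markers."""
    labels = {
        (True, True): "a",    # physical and structural markers
        (True, False): "b",   # physical markers only
        (False, True): "c",   # structural markers only
        (False, False): "d",  # no phrase markers
    }
    return labels[(physical, structural)]


def design_cells():
    """List every within-subject cell of the design."""
    return [
        {
            "phrase_length": length,
            "cutoff_distance": distance,
            "sequence_type": sequence_type(physical, structural),
        }
        for length, distance, physical, structural in product(
            PHRASE_LENGTHS, CUTOFF_DISTANCES, PHYSICAL, STRUCTURAL
        )
    ]


cells = design_cells()
print(len(cells))  # 3 x 3 x 2 x 2 = 36 within-subject cells
```

The count of 36 cells matches the 36 sequences (nine matched sets of four) that were constructed.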

An attempt was made to control for rhythmic similarity across critical phrases of different lengths by 
ensuring that the phrases contained no syncopations. A note was placed on each major beat, and also on 
some subdivisions when melodic context required it. No beat was subdivided more than once, and the only 
note ever lasting longer than one beat was the final note of the phrase. 

Photographic slides of the sequences were prepared which could be projected one by one onto a screen 
placed in front of subjects. The notes were so spaced that each melody took up three lines of manuscript and 
occupied an area of about 12 in by 8 in. Subjects were free to position themselves as they wished with 
respect to the screen. 


Subjects 


Six subjects served in the experiment. They were aged between 22 and 30, and were all accomplished 
keyboard sight-readers with normal or corrected vision. 


Procedure 


Subjects were tested individually on a small electronic organ where the projection screen occupied the 
position normally taken by a music stand. They were instructed that a test of sight reading ability was to be 
carried out in which the memory span for notes seen but not yet executed would be of particular interest. In 
order to investigate this the slide would be switched off without any warning at a point during the 
performance of a melody. At this point subjects would be required to carry on playing until they had 
exhausted their memory of the notes ahead. Subjects were warned not to attempt to guess in cases of 
uncertainty, but simply to stop playing. Such a procedure could tend to underestimate EHS, and to produce 
between-subject variation due to differences in criterion. However, had subjects been instructed to guess as 
much as they could, then their own knowledge of structure would have made it more likely that they 
produced the actual continuation in texts that were structurally well-formed than in those which made 
harmonic nonsense, and this could have produced a spurious increase in EHS for structurally marked 
phrases. Since the experiment was concerned with changes in EHS within subjects between conditions, 
accurate measures of maximum EHS were not crucial to the analysis, but a consistent criterion for a given 
subject was crucial. Subjects were also warned that not all of the melodies would sound quite conventional. 
Stimuli were presented in a mixed order such that members of the same matched set were well separated. 
Subjects chose their own performing speed for each piece, but were required to start playing within 5 sec of 
the onset of a slide. This was done to control against the possibility that subjects might memorize the critical 
phrase before actually playing it. It is true that, in normal circumstances, some musicians like to quickly 
scan through a piece before playing it, but such pieces may last many minutes, whereas the test pieces here 
lasted only a few seconds. In this context, 5 sec is not a disproportionately short scan time, and certainly 
enough for an experienced reader to establish the key and an appropriate performing speed. As soon as a 
subject stopped performing on one piece the next slide was displayed. All performances were tape-recorded 
for subsequent analysis. 

The dependent variable for each trial was the number of notes correctly recalled in unbroken sequence 
immediately following the point at which the slide was switched off. This variable is referred to as the 
eye-hand span (EHS). 
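
The definition of the dependent variable can be illustrated with a short sketch. It assumes that both the printed continuation and the performance have already been transcribed as lists of note names beginning at the cut-off point; the function name `eye_hand_span` and the note labels are hypothetical, and the original scoring was done by hand from tape recordings.

```python
def eye_hand_span(printed, played):
    """Count the notes correctly played in unbroken sequence after cut-off.

    `printed` holds the notes that followed the cut-off point in the text;
    `played` holds what the subject actually performed after the cut-off.
    Counting stops at the first wrong note or when the subject stops playing.
    """
    span = 0
    for expected, actual in zip(printed, played):
        if expected != actual:
            break
        span += 1
    return span


# Hypothetical trial: the subject plays two correct notes, then errs.
printed_notes = ["C4", "D4", "E4", "F4", "G4"]
played_notes = ["C4", "D4", "G4"]
print(eye_hand_span(printed_notes, played_notes))  # 2
```

Because subjects were told to stop rather than guess, a performance that simply ends early is scored by its length, which is what the `zip` loop does.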


Results 


For each trial on which the subject’s EHS coincided with a phrase boundary he was given a 
score of 1, and for each time this failed to occur he was given a score of 0. A 6 × 3 × 3 × 2 × 2 
analysis of variance was carried out on this measure (for subjects, length of phrase, distance 
from cut-off point to critical phrase boundary, presence of physical marker, and presence of 
structural marker, respectively). Of the main effects, that of structural markers was highly significant 
(F = 53.6, d.f. = 1, 5, P < 0.001). EHS coincided with a phrase boundary on 47.2 per cent of 
occasions when harmonic structure was present but only on 19.4 per cent of occasions when 
there was no harmonic structure. The effect of physical markers was also significant but less so 
(F = 7.10, d.f. = 1, 5, P < 0.05). EHS coincided with a phrase boundary on 41.0 per cent of 
occasions when physical markers were present, but only on 25.0 per cent when they were 
absent. The length of the phrase was not a significant factor (F = 0.77, d.f. = 2, 10), but the 
distance from cut-off point to the critical phrase boundary was significant (F = 5.72, d.f. = 2, 10, 
P < 0.025), the tendency to fall on a phrase boundary decreasing as the distance increased (45.8 
per cent for 5, 29.1 per cent for 7, and 25.0 per cent for 9). No interaction of these factors was 
significant, suggesting that their effects are independent. 

A second analysis of variance was carried out on the actual values of EHS. This showed that 
the main effect of structural markers was again highly significant (F = 250.8, d.f. = 1, 5, 
P < 0.001). The mean EHS for sequences with harmonic structure was 5.5 notes as opposed to 
4.5 notes for sequences without harmonic structure. On the other hand, the effect of physical 
markers was not significant (F = 0.65, d.f. = 1, 5), mean EHS for marked and unmarked 
sequences being 4.9 and 5.1 notes respectively. Two significant interactions also emerged: 
structural markers × distance of critical boundary from cut-off point (F = 6.38, d.f. = 2, 10, 
P < 0.025), and physical markers × phrase length (F = 8.13, d.f. = 2, 10, P < 0.01). These 
interactions are shown in Tables 1 and 2 respectively. 

On examination of the distribution of EHS it was apparent that it deviated from normality. In 
particular, spans tended to cluster about the values 5, 7 and 9 which were the distances of the 
phrase boundaries from cut-off points. In order to examine the interactions more closely it was 
decided to use an appropriate non-parametric significance test. Thus Friedman tests were carried 
out at each level of the main marker effects and showed that in both cases EHS varied 
significantly when markers were present (χ² = 6.33, P < 0.05; χ² = 9.30, P < 0.01 respectively), 
but not when markers were absent (χ² = 2.59 and 0.81 respectively). In both cases EHS for 
marked phrases increased as the value of the interacting variable increased. The absence of a 
main effect for physical markers can be explained by the fact that the interaction functions 
overlap, whereas in the case of structural markers they do not. 

It should be noted that no interaction containing both the physical and structural marker terms 
was significant on either ANOVA, thus suggesting that the different marker types had an 
independent effect on performance. 
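
As a rough consistency check, the marginal means and coincidence rates quoted in the text can be recomputed from the cell values reported in Tables 1 and 2. The sketch below assumes equal numbers of readings per cell, which the Materials section states was the intention, so that a marginal mean is the simple average of its three cells. All variable and function names are hypothetical.

```python
# Cell means from Tables 1 and 2 (mean EHS in notes).
TABLE_1 = {  # structural markers x distance to critical boundary (5, 7, 9)
    "present": (5.02, 5.47, 6.02),
    "absent": (4.47, 4.77, 4.33),
}
TABLE_2 = {  # physical markers x phrase length (5, 7, 9)
    "present": (4.36, 4.50, 5.97),
    "absent": (5.36, 4.91, 5.00),
}
# Percentage of trials on which EHS coincided with a phrase boundary
# (marker present, marker absent), as reported in the Results.
COINCIDENCE = {
    "structural": (47.2, 19.4),
    "physical": (41.0, 25.0),
}


def marginal_mean(cell_means):
    """Average equally weighted cell means, rounded to one decimal place."""
    return round(sum(cell_means) / len(cell_means), 1)


def coincidence_gain(marker):
    """Difference in coincidence rate (percentage points) when a marker is present."""
    present, absent = COINCIDENCE[marker]
    return round(present - absent, 1)


print(marginal_mean(TABLE_1["present"]), marginal_mean(TABLE_1["absent"]))  # 5.5 4.5
print(marginal_mean(TABLE_2["present"]), marginal_mean(TABLE_2["absent"]))  # 4.9 5.1
print(coincidence_gain("structural"), coincidence_gain("physical"))         # 27.8 16.0
```

The recomputed values agree with the means of 5.5 and 4.5 notes (structural markers) and 4.9 and 5.1 notes (physical markers) given in the text.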


Table 1. Mean EHS as a function of presence of structural markers and distance from cut-off 
point to the critical phrase boundary 

                        Distance from cut-off point to critical boundary (notes)
Structural markers      5           7           9
Present                 5.02        5.47        6.02
Absent                  4.47        4.77        4.33


Table 2. Mean EHS as a function of the presence of physical markers and phrase length 

                        Phrase length (notes)
Physical markers        5           7           9
Present                 4.36        4.50        5.97
Absent                  5.36        4.91        5.00


Discussion 


These results provide evidence that both physical and structural markers affect the eye-hand 
span of experienced sight-readers. Given that the effect of the two types of marker is 
independent then it would appear that structural markers are at least as important as physical 
markers. The increase in coincidence of EHS with phrase boundary is 27.8 per cent for 
structural markers as compared with 16 per cent for physical markers. Given that good 
sight-reading performance correlates with a large EHS, then we are justified in concluding that 
awareness of structural constraints in musical text is an important component of the 
sight-reading process. On the basis of two analyses carried out it is possible to suggest that the 
two kinds of markers may be affecting performance in rather different ways. First, structural 
marking always increases EHS, whereas physical marking may not necessarily do this. This may 
be shown to follow from the entailments of the ‘redundancy’ and ‘unitization’ hypotheses. 
Harmonic structure makes the text more predictable and allows the reader to process more of 
the text without passing some internal limit on informational capacity. Physical markers, on the 
other hand, do not decrease the information load of the message, they simply prescribe where a 
given analysis should stop and the next one begin. Thus in the cases where the nearest phrase 
boundary is closer than the average span of a subject, the physical marker may act to prevent 
analysis beyond that boundary for a short period of time, during which the iconic information 
decays below recognition threshold. One subject in fact remarked that interphrase spaces 
appeared to hold up the flow of processing momentarily: ‘I knew that I could relax for a 
moment as I approached a space’. If we take Table 2 and substitute for phrase length the mean 
distance of the nearest phrase boundary corresponding to that phrase length, we find this value to 
be 4.5, 4.7, and 5.5 notes, for phrase length 5, 7 and 9 respectively. These values correspond 
rather closely to the mean values of EHS when physical markers are present (viz. 4.36, 4.50, 
and 5.97 notes, respectively), suggesting that EHS does indeed tend to be constrained to the 
nearest physical marker. This suggests that the time course of analysis progresses in leaps from 
one boundary to the next rather than in a continuous flow, and is consistent with the parallel 
processing hypothesis often invoked as a mechanism for the ‘unitization’ process. 

Given that structural markers increase EHS, why should they also tend to cause EHS to 
coincide with a phrase boundary? The most plausible answer would appear to be that the 
informational constraints within phrases are much greater than those between phrases so that 
there is a considerable increase in information load immediately following a phrase boundary. 
Given that harmonic structure allows EHS to expand, then expansion to a phrase boundary is 
more likely than expansion beyond it, since the population of probable continuations increases 
sharply beyond a phrase boundary. Table 1 shows that EHS for sequences without structural 
markers is about 4.5 notes. When structural markers are introduced EHS increases with the 
distance to the critical boundary. As the area of relatively low redundancy after the phrase 
boundary recedes so the EHS can expand towards that area. 

It appears, then, that Levin’s statement, that ‘readers have an elastic span, which stretches or 
shrinks to phrase boundaries’ is true of music readers, but must be qualified to indicate that 
structural markers of phrases cause EHS to expand within an area of high redundancy (i.e. up to 
a phrase boundary), whereas physical markers of phrases can cause EHS to expand or contract 
according to whether the boundary is nearer or further than mean span. An important point to 
note in considering the processing of musical text is that there is no clear division between 
‘acceptable’ and ‘unacceptable’ musical passages in the same way as there is between 
meaningful and meaningless verbal text. On the contrary, much contemporary music is 
‘unacceptable’ when judged by the rules of earlier styles. This means that the presentation of 
musical texts lacking familiar markers has some ‘ecological validity’ for musicians, who are 
likely to be presented with such texts outside the laboratory. The experience of such musicians 
leads one to suppose that they actively search for structures which would allow them to isolate 
the ‘units’ of the new language. It is possible that this search will, in itself, lower sight-reading 
performance below the level attainable were the reader to adopt a less analytic approach. 

The texts used in this experiment shared a very important surface feature with verbal texts, 
for they were single melodies in which the left-right sequencing exactly corresponded to the 
temporal sequencing of the musical sounds that they depicted. Much music also contains a 
vertical component, in that more than one note must be read and played at the same time. The 
need to have a vertical ‘span’ and to be aware of vertical structure could well have a significant 
influence on horizontal processing. However, the situation examined in this experiment is the 
only valid one for a large class of musical instruments (e.g. wind instruments, and human voice) 
which are incapable of producing more than one note at a time. The fact that pianists and other 
keyboard instrumentalists may need to develop rather more complex skills to deal with the 
vertical component should not invalidate these results. 

The fact that music readers use structural as well as physical cues in the analysis of musical 
text suggests analogies between music and language processing at fairly high levels of 
abstraction. Sloboda (1976) has already shown such analogies to hold rather strongly at lower 
levels of visual analysis, and these studies taken together begin to suggest that musical cognition 
may be a non-verbal skill which has the kind of characteristics usually associated only with 
language. It is too early to speculate on the implications that this conclusion may have for our 
notions of language behaviour. 


Sex differences in the emotional behaviour of laboratory mice 


John Archer 





Three laboratory strains of mice (C57, BALB/c, Porton) were tested in an open field using a number of 
measures taken during the initial 2 min of testing and after a loud bell had sounded. The mice were tested 
either in a clean field or in one containing the odour of a same-sex conspecific. Over all three strains, several 
measures indicated that females showed higher levels of ‘emotional’ or ‘fear’ responses than did males. 
This sex difference was found in C57 and BALB/c strains but not in the Portons; it could not be attributed to 
activity or response-specific sex differences. Conspecific odours did not influence the direction of the main 
findings. 





A recent study of three strains of rat investigated sex differences in their open field behaviour 
(Archer, 1974), measuring defaecation, reluctance to leave the periphery, and flight and freezing 
responses to a startling stimulus. Time samples of the animals’ behavioural responses were also 
recorded so that the behavioural measures extended beyond the usual defaecation and 
ambulation counts. The results of this study provided no indication that males were more 
‘fearful’ than females, as had been suggested by Gray (1971) mainly on the basis of more 
restricted open field measures. The more detailed measures revealed some sex differences among 
the rats, but the precise nature of these depended on the strain, the test measure, and whether 
conspecific odours were present. 

Studies of mouse open field behaviour have revealed few cases of sex differences in 
ambulation and defaecation, and these have not been consistent in direction (see review by 
Archer, 1975). Gray (1971) has claimed that the reason for this arises from the widespread use of 
inbred strains. His argument rested on a finding by Bruell (1969) that sex differences in open 
field defaecation were greater in hybrids from a diallel cross of five inbred strains than in the 
parental lines. Although this has been supported by other studies using diallel crosses from 
inbred lines (e.g. Henderson, 1967), it was not confirmed when a genetically more heterogeneous 
sample, derived by mating eight inbred lines of laboratory mice, was used (Blizard, 1971). 
Blizard argued that his method sampled a wider variety of genetic interactions, and thus was 
more representative of the species as a whole. His findings must, therefore, provide a strong 
argument against Gray’s inference from Bruell’s results to ‘outbreeding’ in general. 

Any trend towards an absence of sex differences within inbred lines of mice cannot therefore 
be regarded simply as a consequence of inbreeding. In fact, there are reports of inbred lines 
showing sex differences but the direction is not consistent (Archer, 1975). Henderson (1967), for 
example, found that males generally defaecated more than females for C57BL, DBA, and C3H 
strains, whereas the reverse was the case for BALB/c mice. Three other reports, however, 
noted a reversal of the sex difference for C3H strain (Candland & Nagy, 1969; Nagy & Forrest, 
1970; Nagy & Holm, 1970), and other investigators have found no sex differences in defaecation 
for BALB/c and C57BL mice (e.g. Dixon & DeFries, 1968; Nagy & Glaser, 1970). Differences in 
ambulation, with females ambulating more than males, have been reported in some cases 
(e.g. DeFries, 1964, using BALB/c and C57BLs; Nagy & Glaser, 1970, using C57BLs; and 
Streng, 1971, using several strains), whereas other reports indicate no such sex difference (e.g. 
Dixon & DeFries, 1968, using C57BL and BALB/cs; Nagy & Holm, 1970, using C3H mice; 
Blizard, 1971, using an outbred sample, see above), and Bruell (1969) has reported the reverse 
sex difference using five inbred strains and their hybrids. 

In view of these inconsistencies, and of the limited number of measures used in most studies 
(i.e. ambulation and defaecation: see Archer, 1973), a further study of sex differences in open 
field behaviour was carried out on three strains of mice: a wider variety of test measures was 
used in this study than in previous ones (cited above). Mostly, the measures were chosen to 
represent aspects of the animal's initial emotional or ‘fear’ responses to novel and startling 
stimuli, rather than being concerned with the cumulative effects of several exposures to the test 
situation, which has been the method most commonly used in previous studies (see Archer, 1973, 
pp. 229-231, for a general consideration of methodological issues). Also in view of studies on 
mice indicating the importance of conspecific odours in this type of test (e.g. Whittier & 
McReynolds, 1965; Jones & Nowell, 1972), a comparison of mice tested in a relatively clean open 
field was made with those tested in the presence of an odour from a mouse of the same sex and 
strain. 


Method 
Subjects 


Adult male and female mice, 60 of each sex, aged approximately three months, and housed five to a cage, 
were tested once in an open field. The sample comprised 40 C57 black BUA (from Sussex University stocks, 
initially obtained from University of Birmingham, Anatomy Department), 40 BALB/c (obtained directly from 
MRC laboratories, Carshalton), both inbred lines, and 40 ‘Porton’ mice (from Sussex University stocks, 
initially a random bred albino mouse, obtained from Allington Farm, Porton Down, and originally derived 
from a colony of white Swiss mice). 


Apparatus 


Two open fields were used: each one was black in colour, measuring 30 × 30 × 11 cm, and the square-shaped 
floor was marked into 16 smaller squares. The procedure was the same as that used in the previous study of 
rat sex differences (Archer, 1974): two male mice from one strain were tested consecutively in one open field 
and two females, also from the same strain, were tested in the other open field, on each day. Only the faeces 
were removed in between testing the two mice. Each open field was washed after the second mouse had 
been tested. Thus, the first mouse tested on any particular day was exposed to a relatively clean apparatus 
(termed the NOD or ‘no odour’ condition) whereas the second mouse was exposed to an open field 
containing the odour of a conspecific (OD or ‘odour’ condition). The three strains were tested consecutively 
so that the design was that of three separate experiments rather than a 3 × 3 × 2 factorial design. 
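
The allocation of mice to odour conditions can be laid out as a minimal sketch. It assumes 20 mice of each sex per strain (the group sizes given in the tables), so that testing two males and two females per day takes 10 days per strain. The field numbers and the function names are hypothetical; the text does not say which field was used for which sex.

```python
def daily_schedule(strain, day):
    """Return the four tests run on one day for one strain.

    Each row is (strain, day, sex, field, order, condition). The first
    mouse in a field meets a washed apparatus (NOD); the second meets the
    odour left by the first (OD), since only faeces are removed between them.
    """
    rows = []
    for sex, field in (("male", 1), ("female", 2)):  # field numbers are illustrative
        for order, condition in ((1, "NOD"), (2, "OD")):
            rows.append((strain, day, sex, field, order, condition))
    return rows


def strain_schedule(strain, n_per_sex=20):
    """Build the full schedule for one strain, two mice per sex per day."""
    days = n_per_sex // 2
    return [row for day in range(1, days + 1) for row in daily_schedule(strain, day)]


schedule = strain_schedule("C57")
nod_males = [r for r in schedule if r[2] == "male" and r[5] == "NOD"]
print(len(schedule), len(nod_males))  # 40 tests, 10 NOD males
```

Under these assumptions each strain yields 10 mice per sex in each odour condition, and the same-sex odour arises because each field is used by only one sex on a given day.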


Testing procedure 


The testing procedure (Archer, 1973) comprised a 2 min exposure to the open field, followed by the ringing 
of a loud bell for 5 sec, after which there was a further 1 min observation period. The measures taken were 
time samples every 10 sec of behaviour occurring during the first 2 min, classified as walking and sniffing 
(WS), stationary and sniffing (SS), walking or running (W), rearing (R), immobility (I), and grooming (G). In 
addition, various measures likely to be associated with emotional responsiveness were taken: defaecation 
before the bell sounded, defaecation after the bell, latency to enter the centre squares, the number of 
squares crossed while the bell was sounding (‘flight distance’), and latency to (a) move the head, (b) move a 
leg, and (c) leave a square, after the bell had sounded: the means of the last three measures were calculated 
for tabulation of the results. 
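
The time-sample and latency measures described above can be summarized for a single mouse roughly as follows. This is a sketch under stated assumptions: sampling every 10 sec over 2 min is taken to give 12 samples per animal, the example observations are invented for illustration, and the function names are hypothetical.

```python
from collections import Counter

CATEGORIES = ("WS", "SS", "W", "R", "I", "G")  # behaviour codes from the text
SAMPLE_INTERVAL_SEC = 10
EXPOSURE_SEC = 120


def tally_time_samples(samples):
    """Count how many time samples fall into each behaviour category."""
    expected = EXPOSURE_SEC // SAMPLE_INTERVAL_SEC  # assumed 12 samples per mouse
    if len(samples) != expected:
        raise ValueError(f"expected {expected} samples, got {len(samples)}")
    unknown = set(samples) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unknown behaviour codes: {sorted(unknown)}")
    counts = Counter(samples)
    return {category: counts.get(category, 0) for category in CATEGORIES}


def mean_latency_to_move(head_sec, leg_sec, square_sec):
    """Mean of the three post-bell latencies (move head, move leg, leave square)."""
    return (head_sec + leg_sec + square_sec) / 3


# Invented example record for one mouse.
observed = ["WS"] * 5 + ["SS"] * 5 + ["R", "G"]
print(tally_time_samples(observed))
print(mean_latency_to_move(3.0, 6.0, 9.0))  # 6.0
```

The category counts for each animal are the values whose medians are tabulated as walk-sniff, stationary-sniff and so on.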


Analysis of results 


Analysis of the results was carried out as before (Archer, 1974), first overall sex differences, second sex 
differences within the three strains treated separately, and third a comparison of OD and NOD conditions, 
within each sex and strain (with an assessment of the possible effect of any differences on sex differences). 


Results 
Overall sex differences 


Table 1 shows the combined sex differences for the three strains. Only three measures were 
significantly different at the 0.05 level: males moving to the inner squares sooner than females, 
more females showing immobility (I) during one or more time samples (calculated because the 
individual numerical values were small), and females showing longer ‘flight distances’ after the 
bell had sounded. 


Table 1. Overall sex differences in open field behaviour (medians, unless stated). Time sample 
measures which were uncommon are not included 

Measure                                   Males (n = 60)    Females (n = 60)  P*
Latency to inner squares (sec)            48.5              120               0.01
Open-field defaecation (boluses)          1                 2                 n.s.
Walk-sniff (time samples)                 5                 4                 n.s.
Stationary-sniff (time samples)           5                 6                 n.s.
Numbers of mice showing immobility        5                 14                0.05-0.02**
  in one or more time samples
Bell: flight distance (squares entered)   1.5               3                 0.03
Bell: mean latency to move (sec)†         10.1              11.5              n.s.
Bell: defaecation (boluses)               1.5               2                 n.s.

* Wilcoxon matched-pairs test, two-tailed (Siegel, 1956), except ** (χ² test, two-tailed). 
† Calculated from mean of three measures (see Methods). 


Table 2. Open field behaviour of C57BL mice: Analysis by sex and odour condition (medians). 
Time sample measures which were uncommon are not included 

                                          Sex differences                Males                          Females
                                          Male      Female               No-odour  Odour                No-odour  Odour
Measure                                   (n = 20)  (n = 20)  P*         (NOD)     (OD)      P*         (NOD)     (OD)      P*
Latency to inner squares (sec)            27.5      35.0      0.05       25        28        n.s.       29.5      41        0.05
Open-field defaecation (boluses)          0         0         n.s.       0         0         n.s.       0         0         n.s.
Walk-sniff (time samples)                 5         6         0.02-0.05  5         6         0.05       5.5       6.5       n.s.
Stationary-sniff (time samples)           4         2         0.01       5         3         0.05-0.02  3         2         n.s.
Bell: flight distance (squares entered)   1         3         0.05-0.02  1         2.5       n.s.       2         3.5       n.s.
Bell: latency to move† (sec)              11        9.7       n.s.       11        11        n.s.       6.1       12.3      n.s.
Bell: defaecation (boluses)               0.5       0.5       n.s.       0.5       0.5       n.s.       1         0         n.s.

* Wilcoxon matched-pairs test, two-tailed (Siegel, 1956). 
† Calculated from the mean of three measures (see Methods). 


Sex differences within strains 


The left-hand columns of Tables 2, 3 and 4 show the sex differences for each strain. 

In the C57 strain, males moved to the inner squares sooner, showed shorter flight distances, 
more stationary-sniffing (SS), and less walking-sniffing (WS), than did females. 

In the BALB/c strain, males again moved to the inner squares sooner, and they also showed 
significantly less defaecation after the bell, less combined defaecation over all 4 min, and 
significantly more WS, than did females (NB This last result is the opposite to that found in the 
C57s, and also for rats in the previous study). 
Table 3. Open field behaviour of BALB/c mice: Analysis by sex and odour condition (medians, 
unless stated) 

                                          Sex differences                Males                          Females
                                          Male      Female               No-odour  Odour                No-odour  Odour
Measure                                   (n = 20)  (n = 20)  P*         (NOD)     (OD)      P*         (NOD)     (OD)      P*
Latency to inner squares (sec)            114       120       <0.01      66        120       n.s.       120       120       n.s.
Open-field defaecation (boluses)          2         3         n.s.       2         1         n.s.       3         3         n.s.
Walk-sniff (time samples)                 4         3         0.05-0.02  5         3.5       n.s.       2.5       3         n.s.
Stationary-sniff (time samples)           8         8.5       n.s.       7         8.5       n.s.       8.5       8         n.s.
Bell: flight distance (squares entered)   1         3         n.s.       1         0.5       n.s.       3         2.5       n.s.
Bell: latency to move† (sec)              21        37        n.s.       20        25.1      n.s.       40        27.1      n.s.
Bell: defaecation (boluses)               2         3         0.05       2         2         n.s.       3         3         n.s.
Total defaecation (boluses)               4         6         0.02       5.5       3.5       n.s.       6.5       6         n.s.

* Wilcoxon matched-pairs test, two-tailed (Siegel, 1956). 
† Calculated from the mean of three measures (see Methods). 


Table 4. Open field behaviour of Porton mice: Analysis by sex and odour condition (medians) 

                                          Sex differences                  Males                            Females
                                          Male       Female                No odour   Odour                 No odour   Odour
Measure                                   (n = 20)   (n = 20)   P*         (NOD)      (OD)       P*         (NOD)      (OD)       P*
Latency to inner squares (sec)            120        120        n.s.       120        109        n.s.       120        120        n.s.
Open-field defaecation (boluses)          1          2          n.s.       1          1          n.s.       0          2.5        n.s.
Walk-sniff (time samples)                 3          3          n.s.       3.5        3          n.s.       2          3.5        0.05-0.1
Stationary-sniff (time samples)           5          6          n.s.       5          6.5        n.s.       8          5.5        <0.01
Rear (time samples)                       1          1          n.s.       2          0.5        n.s.       1          2          0.05-0.1
Bell: flight distance (squares entered)   2          3          n.s.       1.5        2.5        n.s.       3          3          n.s.
Bell: latency to move† (sec)              3.5        7.6        n.s.**     3.33       4          n.s.       7.8        7.8        n.s.
Bell: defaecation (boluses)               1.5        1          n.s.       1.5        1          n.s.       1          1.5        n.s.

* Wilcoxon matched-pairs test, two-tailed (Siegel, 1956). 
** P = 0.02-0.01 for latency to move leg. 
† Calculated from the mean of three measures (see Methods). 


Porton mice showed only one significant sex difference: females took longer than males to 
resume leg movement after the bell. 


Differences between NOD and OD conditions in relation to sex differences. 


For males of the C57BL strain, the OD mice showed more active exploration (WS) and less 
passive exploration (SS) than the NOD mice did (Table 2: centre column). For female C57BLs, 
the OD mice took slightly longer to enter the centre squares, and on the ‘latency to move head’ 
measure remained immobile for longer after the bell (P = 0.05-0.02), but this difference was not 
found for the other two latency scores. 

In C57BLs, the main effects of odour condition on sex differences were on two measures, SS 
and latency to enter the centre squares. The sex difference for SS (Table 2, left-hand column) 
was reduced to a non-significant level under the OD condition, whereas a higher level of 
significance (P = 0.01) was obtained for mice tested in a clean open field. On the other hand, 
latency to the centre was reduced to a non-significant level in the NOD condition, but showed a 
larger difference (P = 0.05-0.02) under the OD condition. 

In BALB/c mice, none of the differences between NOD and OD mice were significant 
(Table 3: centre and right-hand columns). Neither were these differences significant for 
male Porton mice (Table 4). 

In female Portons, however, SS was significantly lower for the OD condition (WS and R 
approached significance). These differences did not exert any appreciable effect on sex 
differences, which were at a non-significant level for both odour conditions. 
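
The P values in Tables 2 to 4 are attributed to the Wilcoxon matched-pairs test (Siegel, 1956). As a rough illustration of how that statistic is formed, the following Python sketch ranks the non-zero paired differences by absolute size (tied values share the average rank) and takes the smaller of the two signed rank sums as T. The function name and the scores are hypothetical and are not taken from the study; significance would still be read from published tables for the relevant number of pairs.

```python
def wilcoxon_signed_rank(first, second):
    """Return (T, n) for paired scores.

    T is the smaller of the positive and negative rank sums;
    n is the number of pairs with a non-zero difference.
    """
    # Pairs with no difference are dropped before ranking.
    diffs = [b - a for a, b in zip(first, second) if b != a]
    ordered = sorted(diffs, key=abs)

    # Assign ranks 1..n by absolute size, averaging ranks within ties.
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    positive = sum(r for d, r in zip(ordered, ranks) if d > 0)
    negative = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return min(positive, negative), len(ordered)


# Hypothetical paired scores (illustration only, not data from the study).
no_odour = [5, 7, 3, 6, 4, 8]
odour = [6, 9, 3, 10, 3, 11]
t_stat, n_pairs = wilcoxon_signed_rank(no_odour, odour)
print(t_stat, n_pairs)
```

With these invented scores one pair is tied and dropped, leaving five pairs; the single negative difference shares the lowest rank, so T is small.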


Discussion 
For all three strains combined, females showed longer latencies to enter the central squares, 
moved further in response to the bell, and more of them showed immobility during a time 
sample than did males (Table 1). Thus, females showed higher scores than males for three 
measures of emotional behaviour: these results (and the direction of the differences for other 
measures shown in Table 1) are in the opposite direction to Gray's (1971) hypothesis that male 
rodents are more ‘fearful’ than females (see Introduction). Gray's claim that inbred mouse 
strains were less likely to show sex differences than outbred ones was also not supported by the 
present findings, since the (outbred) Porton mice showed fewer significant sex differences than 
the other two (inbred) strains did. 

Analysing the two inbred strains separately, it was found that their sex differences were 
similar in general direction to the results for the combined sample. In the C57 strain, females 
showed longer latencies to enter inner squares, and longer flight distances, but no evidence of 
immobility differences. BALB/c females also showed longer initial latencies, and also more 
defaecation than males, but there were no significant differences for flight and immobility. 

The significance of these measures is now discussed in more detail. Flight distance and 
immobility are fear responses, one being an active, the other a passive, response. Thus the finding 
that females showed higher scores for both measures (Table 1) indicates that the sex difference 
is neither specific to one form of fear response nor is a secondary consequence of activity 
differences. (Elsewhere it is argued that these two factors can explain some of the reported rat 
sex differences: Archer, 1975.) 

A third response, defaecation, has often been used in open field studies, to measure 
‘emotionality’ (see Introduction; also Archer, 1973). The only sex difference in defaecation 
found in the present study was in the combined score both before and after the bell by BALB/c 
mice, females defaecating more than males. Henderson (1967) reported a similar sex difference 
for BALB/c mice, using a single test period of 5 min (i.e. a similar duration to the present one). 
Other results for BALB/c mice have shown either more defaecation by males (McReynolds, 
Weir & DeFries, 1967; Bruell, 1969; Streng, 1971) or no sex difference (Dixon & DeFries, 1968; 
Sayler & Salmon, 1971), but these studies all involved repeated testing or a single long period of 
10 or 20 min. Possibly, initial defaecation is more related to emotional responsiveness (females 
scoring slightly higher) whereas over longer periods of testing, males defaecate more because of 
odour deposition (a suggestion made by Collins, 1966; see also Archer, 1975). 

A fourth measure was the latency to enter the centre of the open field: this has not been 
widely used before, and the only available comparative data are those obtained in the similar 
study on rats (Archer, 1974). Here a sex difference in the same direction as the present one 
(males showing shorter latencies) was found for the Sprague-Dawley strain, in which the 
females also showed more defaecation (as in the BALB/c mice: see above). Other studies, using 
different types of latency to emerge (e.g. from the home cage to a novel environment, from a 
start-box to an open field) have typically shown longer latencies in males than females (e.g. 
Woods, 1962; Quadagno, Shryne, Anderson & Gorski, 1972). It is argued elsewhere (Archer, 
1975) that these longer latencies by males reflect their generally lower activity levels, found in 
many situations. Since the present findings are in the opposite direction to these, they cannot be 
accounted for in the same way. In view of the other differences discussed above, it is more 
likely that they form part of a pattern of greater emotional responsiveness by females of the 
C57BL and BALB/c strains, than by their male counterparts. 

The precise significance of this sex difference cannot be deduced from the present results 
alone, but some preliminary suggestions can be made. It seems to be confined to short-term 
responses to relatively large environmental changes, since comparable findings have not been 
observed over longer test periods or with repeated testing (see Archer, 1975, for review of these 
studies). The sex difference may be a product of the particular environmental and social 
conditions under which laboratory mice are housed (unisexual groups in the present case), or it 
may be of more widespread occurrence, stable despite considerable environmental variability. 
This question is particularly important in deciding whether the difference is of functional 
significance. 

Finally, brief mention is made of sex differences in activity, and the influence of odours. 
Typically, previous work on rats and mice using repeated tests shows females to be more active 
than males (Archer, 1975). In the present study, this sex difference was apparent only for the 
C57 mice. In the other two strains, responses more specific to novel and startling stimuli may 
have obscured the more usual sex difference in activity. 

Testing rats in the presence of a conspecific odour affected several open field measures 
(Archer, 1974): however, odours exerted a less pronounced effect in the present study. In female 
Porton and male C57BLs, there was an indication of more active and less passive exploration 
when the odour was present, and this is comparable with previous findings by Whittier & 
McReynolds (1965) and by Van Oortmerssen (1971). These differences between odour conditions 
produced few changes in sex differences, again in contrast with the previous study on rats. 

In summary, therefore, several measures from BALB/c and C57BL mice showed sex 
differences indicating that females showed more pronounced initial emotional responses, such as 
defaecation, freezing, fleeing, and restricting their movements to the wall of the novel arena. 
The typical sex difference in activity was only found for C57BLs, and the presence or absence 
of same-sex conspecific odour was not found to affect the general direction of the sex difference 
in emotional responses. 


Non-verbal and verbal behaviours associated with parting 


Angela B. Summerfield and J. A. Lake 





The aim of the study was to examine the effect of sex of subject and acquaintance on the behaviour of 24 
pairs of same-sex subjects at a particular transition point, the closure of a meeting. Each pair of subjects 
was asked to discuss a topic of its own choosing for 20 min. They were instructed that a buzzer would then 
sound and they should open envelopes which instructed one or both of them to go elsewhere for the rest of the 
experiment. The procedure was videotaped. Subjects subsequently completed a questionnaire giving their 
perceptions of themselves and their partners. It was found that friends showed more verbal and non-verbal 
interaction during parting than did strangers. Differences related to both sex and degree of acquaintance 
were found in the questionnaire responses. The results are discussed in relation to social skills and to 
differential expectations of future interaction. 


The Eskimos had no word for farewell in their language, but came and went without ceremony... ‘I am 
going’ I said again, using their only greeting of farewell; and they answered together ‘You are going’. 
(Diamond Jenness, 1964) 

Argyle & Kendon (1967) have suggested that social interaction is characterized by successive 
sequences of behaviour, each selected as appropriate to a particular stage in a relationship. They 
note that part of a social skill consists of correctly recognizing a cue and providing the 
appropriate response to it. This is particularly important at what the authors have earlier 
described as ‘transition points’ in social interaction. Steer, Charles & Lake (1973) examined the 
verbal and non-verbal behaviour of subjects who carried out two visual preference tasks 
together after a period of initial discussion. The subjects were required to write down their joint 
decision for each task and evaluate the quality of their performance when both tasks were 
completed. They defined a ‘transition point’ as a change in goal by the participants, as for 
instance when they completed their preliminary discussion and started work on the first task. 
Significant changes in eye contact, touching and vocalization were associated with such 
transitions. 

Van Gennep’s (1908) analysis of rites of passage helps to elucidate the nature of transition 
points. Van Gennep noted that changes in social position were associated with particular rites. 
Each transition involved three phases, namely separation from the previous social status, an 
actual transition phase and, lastly, incorporation into the new social status. The rites for each 
phase were distinctive. 

Partings are a special example of a transition point. Research on their characteristics has had a 
predominantly sociological emphasis. Sissons (1971) in ‘The Paddington Station Experiment’ 
studied interactions in which an inquirer asked the way. She demonstrated gross social class 
differences in both verbal and non-verbal behaviours at partings. Working class subjects showed 
fewer formalized responses at the end of partings. Schegloff & Sacks (1973) carried out a 
detailed linguistic analysis of the verbal content of partings from a sociological point of view. 
There is also anthropological evidence on cultural differences in transitional behaviour (for 
example, Basso, 1970). 

In psychological terms, partings can be expected to have up to three stages of the type 
detailed above. The preparatory phase may be absent when the parting is initiated from outside 
the interaction and is not expected by the participants. It is suggested that the amount of 
involvement participants show in parting is proportionate to their investment in the relationship, 
and that cultural determination of greater affiliative behaviour in women than men is likely to 
increase further the amount of interaction by women subjects during partings. Participants’ 
perceptions of an interaction and their actual behaviour are often different. For this reason the 
present study was designed so that the hypotheses could be tested not only by recordings of 
verbal and non-verbal behaviour but also by the subjects’ responses to a questionnaire 
administered after the parting in which their feelings about the interaction and their partner were 
examined. Since it was thought that partings might be arbitrarily affected by whether one or both 
members of a dyad were obliged to leave, this variable was also included. 

It has been fashionable of late to study social behaviour in ‘natural’ settings. It can be argued 
that it might be more scientifically profitable to concentrate on ones which are ‘valid’. In other 
words, the ‘naturalness’ of a party or a street is no guarantee that the participants will see all 
interactions taking place there as being appropriate, timely, or indeed sufficiently important to 
engage their attention. Furthermore, many ‘natural’ settings often have drawbacks for the 
experimenter in that they necessarily involve interruptions, noise and other invasions which 
detract from a well-controlled experimental environment. In this study an attempt was made to 
set up a valid interaction involving parting in a laboratory setting. It was assumed that volunteer 
adult subjects would intend to pay a reasonable degree of attention to the experimenters’ 
instructions and would regard it as appropriate for the experimenters to request them to stop or 
change a particular activity in which they were engaged. Within these constraints subjects were 
to be asked to complete a conversation and separate so that they could take part in the rest of 
the experiment in other parts of the building. This would seem to constitute a valid parting for 
the subjects in that they would accept the instruction from the experimenter and understand the 
reason for which it has been given. 

In summary, it was hypothesized that subjects who were friends and pairs of female subjects 
would interact more during partings than strangers or male pairs. It was also hypothesized that 
verbal and non-verbal interactions would be affected by whether one or both subjects had to 
leave the experimental room. 


Method 

Subjects 

The subjects were 24 men (mean age 24.29 years, range 18-38 years) and 24 women (mean age 25.27 years, 
range 18-47 years). They were either students or employed in a variety of technical or professional 
occupations in Social Classes II and IIIa (Registrar General's Classification). Six subjects of each sex were 
asked to bring a friend of the same sex to take part in the experiment. The other 12 subjects were paired with 
strangers. 


Apparatus and materials 


Recording equipment. Recordings were made using Shibaden and I.V.C. closed-circuit television equipment 
and a TRD DPA/1 tape-recorder with an AKG lavalier microphone. Optimum recording levels for sound and 
vision were previously determined and kept constant across subjects. 

Measures of length of speech and of looking at the other person were taken using two circuits, each 
consisting of a Birkbeck timer, a Venner stopclock and a response key, which were used by two observers. 


Subject questionnaire. A subject questionnaire was constructed in order to obtain personal information about 
the subjects and their perceptions of themselves and their partners. Bipolar continuous rating scales 10 cm 
long were used for measuring evaluative judgements. 


Procedure 


At the beginning of each testing session, the two members of each pair were given separate written 
instructions in which the experimenters explained that they were carrying out a study of friendship. The 
subjects were then taken into the laboratory and seated 2½ ft apart at each side of a table facing each other. 
The experimenters instructed them to discuss any topic they wished for 20 min in order to get to know each 
other better. They were told that at the end of this period a buzzer would sound and they should open sealed 
envelopes which would provide instructions about the next stage of the experiment. It was explained that 
confidential sound and vision recordings would be made. At the end of the 20 min conversation, the 
instructions requested one or both subjects to go to other rooms for the rest of the experiment. On arrival 
they filled in the subject questionnaire. The entire encounter was recorded from the onset to the time when 
the last subject in each dyad who had to leave the laboratory closed the door behind him. 

There were three independent variables, degree of acquaintance (friends-strangers), sex of dyad 
(male-female) and go-stay (one-both subjects left the laboratory). Pairs assigned to different treatments were 
tested in random order. At the end of the experiment the purpose of the study was explained to the subjects 
and their questions were answered. None of the subjects were aware that the experiment was concerned 
with parting and they found the manipulation realistic. 


Results 
Verbal and non-verbal behaviour 


The closure period was defined as the time interval from the buzzer until the last subject to 
leave shut the laboratory door. The mean length of the closure period was 36.16 sec with a range 
of 21-55 sec. A three-way analysis of variance (acquaintance × sex × go-stay) was carried out on 
time to closure. No significant main effects or interactions were found. It was therefore assumed 
that length of interaction would not have contaminated the other results. Social interaction 
during closure was then examined. Two measures, number of words spoken and amount of 
looking at the other person, were taken. 

The mean number of words spoken during closure by each dyad was 36.24 with a range of 
7-109. 

Because of the distribution of the data, a square root transformation was carried out and a 
three-way analysis of variance (acquaintance × sex × go-stay) was performed on the transformed 
variable. A significant effect was obtained for acquaintance (F = 5.157, d.f. = 1, 16, P < 0.05) 
indicating that friends spoke significantly more words during closure than did strangers. No other 
significant main effects or interactions were found. 
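
To illustrate the kind of computation described above, the sketch below applies a square root transformation to word counts and then computes an F ratio. It is deliberately simplified to a single factor (acquaintance) rather than the three-way design reported here, and the counts and the function name are invented for illustration only.

```python
import math


def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance."""
    values = [v for group in groups for v in group]
    grand_mean = sum(values) / len(values)
    means = [sum(group) / len(group) for group in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the scores around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)

    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical words-per-dyad counts (illustration only, not data from the study).
friends = [49, 64, 36, 81]
strangers = [16, 25, 9, 36]

# A square root transformation is a common choice for count data,
# where the spread tends to grow with the mean.
friends_t = [math.sqrt(v) for v in friends]
strangers_t = [math.sqrt(v) for v in strangers]

f_ratio, df_b, df_w = one_way_f([friends_t, strangers_t])
print(round(f_ratio, 2), df_b, df_w)
```

In this made-up example the raw counts for the higher-scoring group are more widely spread, whereas after transformation both groups have the same within-group spread, which is the sort of situation the transformation is meant to address.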

The conversations during closure were transcribed. Conventional farewells such as 
‘Cheerio’ or ‘Goodbye’ were used by six pairs of friends and three pairs of strangers. Such 
phrases were taken as evidence of completion of the second phase of transition. Eight pairs of 
friends and four pairs of strangers read out their instructions to their partners. 

The mean amount of time spent looking at the other person by each partner during closure was 
5.70 sec with a range of 0.45-21.9 sec. A three-way analysis of variance (acquaintance × sex × 
go-stay) was carried out. A highly significant effect was obtained for acquaintance (F = 27.13, 
d.f. = 1, 40, P < 0.001), indicating that friends looked at their partners more than strangers did. 
No other significant main effects or interactions were obtained. 


Questionnaire responses 


Differences in the behaviour of subjects during closure were considered further in the context of 
their perceptions of themselves and their partners. The most striking result was the lack of 
correspondence between the behavioural data and subjects’ responses to the questions ‘How 
satisfying was your conversation with the other person’. A significant effect was obtained for 
sex of subject (F = 11.27, d.f. = 1, 44, P < 0.01) and for question (F = 7.02, d.f. = 1, 44, 
P < 0.025). Women showed more appreciation than men on both questions and all groups 
liked the other person more than they felt satisfaction with the interaction. This is in contrast to 
the lack of sex differences in the behavioural data, where friendship differences were found 
instead. No other significant main effects or interactions were obtained except for a significant 
interaction of sex × acquaintance × question (F = 7.78, d.f. = 1, 24, P < 0.01). 

As would obviously be expected, friends thought they had known their partners for a longer 
time than did strangers. The questions ‘How long have you known the other person’ and ‘How 
well do you know the other person’ were examined together. An analysis of variance 
(acquaintance × sex between subjects, question within subjects) was carried out. A significant 
effect was obtained for acquaintance (F = 41.98, d.f. = 1, 44, P < 0.001) and for question 
(F = 37.46, d.f. = 1, 44, P < 0.001). 

Discussion 

The results indicate that although the various treatment groups did not differ in the amount of 
time they took to close the meeting, they differed substantially in their amount of social 
interaction. Friends showed significantly more interaction than strangers as measured both by 
number of words spoken and by amount of looking during closure. In the more general context 
of the whole encounter, friends thought they knew each other better and for longer. Female 
subjects were more satisfied with the encounter and liked each other more. 

The most notable aspect of the findings appears to be the marked differences in interaction 
shown by friends and strangers, compared with much lesser differences in their perceptions of 
the interaction. Indeed, significant differences between males and females were found in the 
questionnaire responses alone. 

Taking first the behavioural differences between conditions, the greater amount of interaction 
shown by friends in comparison with strangers may well have reflected their expectations of 
future interaction. For friends, interaction during closure probably represented a transition from 
one meeting to the next, even if specific arrangements were not made at the time. It is doubtful 
whether this was so for the strangers. This result is not necessarily suggested by common sense. 
It could be argued that friends would have a smoother well-established routine for parting than 
strangers, and consequently would complete the interaction more quickly and with fewer words. 
That was not the case here. 

The lack of sex differences in the behavioural measures is surprising in that women subjects 
expressed greater appreciation of the interaction when completing their questionnaires. One 
possible explanation is that a lack of sex-role differences in interaction is specific to well-educated 
subjects who are used to working and studying with members of the opposite sex on equal 
terms. Alternatively, it may be an unwarranted assumption that the evaluative components of 
attitudes to others have behavioural correlates during short interactions, particularly in the 
task-directed ambience of a laboratory setting. 

The present exploratory study has attempted to chart some of the characteristics of closing a 
meeting in a laboratory setting. The most significant unanswered question would seem to be 
under what circumstances perceptions of such interaction converge with the actual behaviour, if 
they do at all. 


Book reviews 


Gaze and Mutual Gaze. By Michael Argyle and Mark Cook. Cambridge: Cambridge University Press. 1976. 
Pp. xi+201. £6.50. 


In ‘Gaze and Mutual Gaze’ the authors achieve a novel examination in depth of the phenomena of eye 
contact and looking in social interaction. In attempting this they have brought together research findings 
which were distributed among quite disparate sources. Their conjunction is valuable not only for reasons of 
convenience, but because of the intellectual synthesis which is made possible. 

Mr Argyle and Dr Cook approach the findings in the first instance from biological and cultural standpoints, 
giving necessary perspective to more strictly psychological work. They devote a whole chapter to the 
measurement of gaze. Such a lengthy account is well merited given the considerable technical difficulties 
with which investigators have to come to terms. This chapter states the issues clearly and has some excellent 
illustrations. The chapters reviewing psychological studies give comprehensive coverage of the literature, 
although greater theoretical interpretation would have been useful to the specialist reader. Studies of gaze 
are particularly susceptible to the details of the experimental environment and many apparently contradictory 
and puzzling results can be resolved by such a detailed analysis. It is pleasing to see specific reference to the 
practical implications of the research cited, but many readers might wish that this had been taken even 
further, given the authors’ expertise in social skills training. The bibliography is of great value in itself. The 
subject index could have been more detailed. 

Gaze and Mutual Gaze is to be welcomed because it attempts to classify, analyse, and integrate studies in 
a specific aspect of non-verbal communication. It is an advance on the rather descriptive accounts which 
have been characteristic of the field. Necessarily, the interrelation of gaze and other non-verbal behaviour is 
de-emphasized. The book will be essential for specialists in non-verbal communication and attractive to other 
readers who prize a considered and well-documented account of one aspect of social behaviour. 

ANGELA B. SUMMERFIELD 


Models of Madness, Models of Medicine. By Miriam Siegler and Humphrey Osmond. London: 
Collier-Macmillan. 1976. Pp. 287. £5.00. 


One of the consequences of the phenomenal rise in the popularity of the conspiratorial view of madness, as 
popularized in a variety of ways by Szasz, Goffman, Scheff and Laing, a rise which appears now to have 
peaked and may even be in decline, has been a vigorous reassessment of the medical role in psychiatry and a 
burgeoning analysis of the professional foundations on which rests the clinical and scientific authority of the 
practising psychiatrist. Miriam Siegler and Humphrey Osmond, from the outset, have been active in 
formulating a response to this latest challenge and the combination they represent, namely a medical 
sociologist who is prepared to examine sympathetically the relationship between the sick role of the patient 
and the aesculapian authority of the doctor and a practising psychiatrist familiar with the fragmented state of 
his subject and long exposed to the din of competing ideologies, goes a long way to explain the pungency, 
vitality and force which have been features of their writings over the past two decades. 

Those who are familiar with these writings will find little that is new in their latest offering. Once again 
they outline their models of madness. Three — the medical, the moral and the impaired - they describe as 
discontinuous in the sense that they embody a partial or restricted view of the problem of madness. Five - 
the psychoanalytic, the family interaction, the psychedelic, the social and the conspiratorial — are classed as 
continuous in that like the religious and cosmological models from which in part they appear to be derived, 
they offer an inclusive, global picture of human destiny in which our various misfortunes play a significant 


Unashamedly they endorse the inherent superiority of their so-called medical model over all its rivals in 
unlocking the secrets of psychological ill-health and in easing the mental anguish of its sufferers. In so far as 
there have been dramatic advances made in the areas of research and treatment of the functional psychoses 
and the neuroses, these have occurred within the confines of the medical approach. The medical approach 
does not stigmatize, preach, convict relatives of the crime of making their patients mad, indict political 
systems for producing schizophrenia. The sick person who conforms to the sick role may be seen as heroic, 
unfortunate, a bit of human wreckage. That he is categorized by many sociologists as ‘deviant’ is blamed by 
Siegler and Osmond on Parsons’ ‘error’ in perversely applying this adjective to ‘a universal role learned so 
early in life’. The doctor's aesculapian authority is seen to rest on his sapiential authority, namely his right 
to be heard by reason of knowledge or expertness, his moral authority, as expressed in the Hippocratic 
Oath, and his charismatic authority, by which is meant that element in his role which reflects the original 
unity of medicine and religion and the many unknown and unknowable factors inherent in physical and 
mental disorder. 

The case for the defence is made out with skill and even daring. At times it eschews logic for pyrotechnics 
and its tendency to roam outside the narrow confines of the medical and sociological literature, while giving 
it a refreshingly lively and readable quality, does somewhat diminish its acuity. More worrying, however, is 
the fact that such has been the enthusiasm shown by Siegler and Osmond down through the years for 
engaging in battle with the protagonists of their various alternative models there is a tendency for many to 
assume that their ‘medical model’ and the medical approach to psychiatric illness is one and the same thing. 
They, perhaps unwittingly, encourage their readers to believe this by their neglect of those who, while 
likewise endorsing a medical and biological approach, refuse to accept the dichotomous formulation at the 
heart of the Siegler-Osmond philosophy. An example of this is the way in which these two authors appear to 
believe that there is a fundamental antagonism between the biological and the behavioural (classified by them 
as ‘moral’) ‘models’. Another example can be seen if we compare, say, Aubrey Lewis’s medical model 
(though he would have never succumbed to using such an inelegant and opaque term). ‘Whether the 
constitutional factor is the predominant or determining influence, or the environmental one’, he once wrote, 
‘is never a question of kind, never a question to be dealt with as an “either/or” problem’. But it is precisely 
as an ‘either/or’ problem that it is dealt with by Siegler and Osmond and, in forcing such an artificial and 
unnecessary division, they do seem to be contributing to the very ‘model muddlement’ for which they 
castigate the community enthusiasts and the social theorists. 

The book concludes with an analysis of the various models as they might be applied to the problems of drug 
addiction and alcoholism. Here, again, the unease experienced earlier in the book surfaces. They may be 
right about their whole-hearted endorsement of their ‘medical model’ in the identification and management 
of alcoholism. They may be wrong. But they are certainly indifferent to those negative consequences of the 
disease concept of alcoholism acknowledged even by psychiatrists who in other psychiatric conditions 
readily endorse such an approach. Indeed, their defence of the biological approach to the understanding of 
psychiatric disorder and of the interaction between the sick patient and psychiatric clinician would have 
benefited, and benefited enormously, by a more ready willingness to acknowledge the shortcomings. Even 
the most sympathetic reader is going to be asking ‘Where are the warts?’ 

ANTHONY CLARE 


Theoretical and Experimental Bases of the Behaviour Therapies. Edited by M. Philip Feldman and Anne 
Broadhurst. Chichester: Wiley. 1976. Pp. xi+459. £12.00. 


The aim of this book is stated by the editors to be that of building further bridges between theory and 
practice in clinical psychology, similar to that provided by Kanfer and Phillips’ Learning Foundations of 
Behavior Therapy. It is divided into three sections concerned with biological psychology, with the 
psychology of learning and with social psychology and the psychology of decision making respectively, with 
an additional section on theoretical, methodological and ethical issues which, although the three chapters in 
it are interesting, does not, strictly speaking, come within the scope of the title. 

This is a very valuable book. The four chapters in the section on the psychology of learning, ‘Classical 
conditioning’ by Ian M. Evans, ‘Instrumental learning: comparative studies’ by Morrie Baum, ‘Operant 
conditioning and clinical psychology’ by Glyn V. Thomas and Derek E. Blackman and ‘The experimental 
analysis of retarded behaviour and its relation to normal development’ by James Hogg provide admirable 
guided tours through some of the recent literature. A similar function is performed by the section on social 
psychology and decision making, which also consists of four chapters, on ‘Observational learning and 
therapeutic modelling’ by S. J. Rachman, ‘Social psychology and the behaviour therapies’ by the first editor, 
‘Applications of the psychology of decisions’ by the second editor and ‘Behavioural assessment and 
decision-making: the design of strategies for therapeutic behaviour change’ by Richard I. Lanyon and 
Barbara J. Lanyon. Such chapters furnish the practising behaviour therapist with an excellent refresher 
course. There are, inevitably, imperfections. Some chapters could have been improved if the editors had 
taken firmer control of their contributors. For example the first chapter in the section on biological 
psychology, ‘The behavioural inhibition system: a possible substrate for anxiety’, by Jeffrey A. Gray, begins 
interestingly enough with an account of the author’s alternative to Eysenck’s personality theory, but then 
wanders off into a discussion of the effects of various drugs on the theta rhythm of the rat, a topic whose 
implications for behaviour therapy are, to say the least, not made clear. Again, the second chapter in this 
section, ‘Psychophysiological variables’ by P. H. Venables begins with the suggestion that 
psychophysiological techniques may be useful (1) as diagnostic tools, (2) as indicators of change, (3) in 
bio-feedback techniques, but is then restricted to the discussion of electrodermal and cardiovascular activity 
and the measurement of emotion, largely ignoring the suggested uses. Further, the sections on biological 
psychology and the psychology of learning suffer in places from an excessive complexity which defeats the 
bridge-building objective of the book: if it is to reach behaviour therapists who are not specialists in these 
other fields, then all, and not just much, of it should be comprehensible to them. However, these criticisms 
are outweighed by the overall merits of the book. 

Paradoxically, its most interesting feature is that in a sense the book undermines the presuppositions 
suggested by its title, namely that there exist fairly well-established bodies of theory and experimental 
findings which, if they are made available to the practitioner of behaviour therapy, will enable him/her to 
treat patients more effectively. Chapter after chapter shows how little psychology has in the way of 
satisfactory theory, and how confused and mutually contradictory the experimental findings are. Doubts 
about the applicability of this work are expressed openly by Ian M. Evans, who writes ‘One cannot help but 
feel that the direct value of these studies for behaviour therapists has declined. For years, behaviour 
therapists have relied on a simple, textbook kind of account of classical conditioning, and it has fared them 
quite well’. That this should not worry us too much is part of the burden of the chapter by H. J. Eysenck 
(the only chapter which is actually unputdownable) whose succinct formulation cannot be improved: ‘the 
hard sciences do not proceed by an easy hypothetico-deductive path towards facts involved in pure and 
applied research and. . .their theories are temporary, constantly in need of patching up and constantly being 
upset by empirical findings that do not fit into the existing theoretical network. So of course, are ours; this is 
not a sign of weakness, but of strength’. 

J. P. N. PHILLIPS 


Motion Sickness. By J. T. Reason and J. J. Brand. London: Academic Press. 1975. Pp. x+310. £9.40. 


This closely follows the pattern of the motion sickness section of the first author’s previous book Man in 
Motion. Whereas the former was in non-technical language for the traveller, this has the full academic 
treatment with sources formally cited. The fascinating history of bizarre treatments is sketched, followed by 
a review of the extent of the problem created by motion sickness. From there the authors proceed to analyse 
the phenomenon more closely and then examine the role of the vestibular system. In general, normal persons 
are likely to suffer motion sickness to some degree when the information signaled by their vestibular system 
fails to bear the normal internal consistency or conflicts with visual information. Such conflicts can occur in 
a variety of ways but perhaps the most striking are revealed by moving the head when travelling in a vehicle 
which is rotating. Nodding yields a strong turning sensation while rocking sideways yields a pitching 
sensation and both are accompanied by symptoms of sickness if repeated persistently. Adaptation will occur 
naturally. On a sea voyage it normally takes about three days. It can be hastened by deliberately 
exacerbating the symptoms. Much research on adaptation has been promoted by the space flight agencies. 
As a result, astronauts who show signs of sickness are instructed to follow a regimen of rapid head 
movements. On the other hand, the traveller can attempt to minimize symptoms by lying on his back and 
keeping his head still. He can take anti-motion sickness drugs, of course, but these will only postpone the 
onset of symptoms. If you are on a prolonged journey and have a job to do, adaptation is the only answer. 
Research on motion sickness is not very extensive. Reason has done much of it himself and it is a 
pleasure to see a research project carried through to a point where useful advice can be dispensed and then 
reported in such a readable form. There are many loose ends, of course. Those which seem most prominent 
concern psychological factors. My main reservation regarding this research is that it is almost a complete 
sell-out to the neuro-physiologists. The unanswered questions discussed in the concluding chapter emphasize 
this direction, for they concern the specification of the underlying neural structures, yet we must all have 
personal experiences which suggest the importance of psychological factors. For myself, I can vividly 
remember being shown by a keen major just how the Army pilots carried out artillery spotting in fixed wing 
aircraft. They would fly along behind a line of trees, order the guns to fire and then climb rapidly a couple of 
hundred feet, hang in the air while the fall of the shells was observed and then dive for cover again. As 
passenger, I turned green quite rapidly. What is of interest, though, is a later experience when travelling in a 
train. I thought, ‘This is just like the low flying I was shown by that Army major’ whereupon I promptly felt 
sick. The odd report cited by Reason and Brand indicates that motion-sickness is maximized by instructions 
to attend to bodily symptoms and minimized by busy mental activity. Wartime investigators attempted to 
distinguish between the effect of sea-sickness upon battle performance and maintenance activities. The studies 
were not formal. However, they indicated that most of the men affected could perform their normal tasks 
satisfactorily when really needed, yet would be careless about less essential jobs such as stowing gear. These 
investigators recognized the important role of volition. Sadly we have failed, in the meantime, to make any 
progress on such psychological issues as measuring mental effort or controlling our thoughts. If our more 
competent colleagues choose to pursue neurophysiological issues we never will. 

H. C. A. DALE 


Human Stereopsis: A Psychophysical Analysis. By W. Lawrence Gulick and Robert B. Lawson. London: 
Oxford University Press. 1976. Pp. xi+292. £11.75. 


After a brief Introduction, there is a pretty detailed account of the history of research on stereopsis. Starting 
with the ancients, we are led through the work of Leonardo, Wheatstone, Brewster, Panum, Helmholtz, Ogle 
and numerous lesser lights. Considering the chronological theme of this part of the book, the practice of 
sending the reader to the end of the chapter to find out who discovered what when (references in the text are 
only numbered) is especially distracting. However, the treatment is generally clear and provides a good 
background to the subject. 

The next chapter deals with retinal correspondence and the horopter. The discussion concentrates on the 
theoretical issues, rather than the many empirical studies which have been carried out over the years. This is 
no bad thing for, as the authors convincingly demonstrate, the notion of a geometrical horopter is not a very 
useful concept. 

The remainder of the book is mainly concerned with previously unpublished accounts of experiments 
carried out by the authors or their graduate students. Topics covered include contours, interposition, visual 
direction, size-distance constancy and many others. Throughout this there runs a plausible discussion that all 
but persuades the reader that everything has been taken care of, that every problem tackled has been 
adequately treated, and that all the results are final. And perhaps this is so, but there seem to be an awful 
lot of contradictions in this book. Figures are not as described in the text - perhaps a poor draughtsman. 
Data presented twice are changed — perhaps they are not supposed to be the same. A distribution is expected 
to be symmetrical, found to be asymmetrical, but taken to be symmetrical just the same — perhaps rightly. 
To take a particular example: on page 52 evidence is cited (some of it their own) and the assertion is made 
that ‘instead of contours giving rise to depth, it is rather depth that gives rise to contours’. Now this I found 
contentious, I had always thought that both arose from disparity, but I was prepared to be persuaded. By 
page 254, however, I found I had won the argument: ‘both contour perception and stereoscopic depth arise 
directly from disparity’. 

It is difficult to know what to think about this part of the book. If only one could have had enough detail 
of the experimental work to feel sure that the ambiguities are not serious, then it would have added 
considerably to its value. For all their flaws, there may be something to be said for publication referees. 

This is certainly not a bad book, indeed the first part is certainly good. The rest lies somewhere between 
fair and very fair. For those in binocular research, this book is close to essential, but for anyone else its 
value will be weighed against its price. 

MARTIN CRAWSHAW 


Elements of a Two-process Theory of Learning. By J. A. Gray. London: Academic Press. 1975. Pp. x+423. 
£6.60. 


Dr Gray has written an interesting and entertaining book, the chief aim of which in his own words is to 
illustrate the ‘mysterious ways learning theorists think’. This is not a text book of animal learning, but rather 
a highly selective treatment of key issues, which will be particularly useful for discussion with 
undergraduates at an advanced level. At the same time, Gray’s definite stand on certain problems will make 
the book of interest to specialists, and it is certain to be quoted for some time to come as a standard 
treatment of the relationship between Pavlovian (classical) and Thorndikian (instrumental) conditioning. Dr 
Gray is at his best in the treatment of habituation, where his knowledge of the Russian literature is very 
valuable; and in his systematic and original treatment of the Drive concept. He distinguishes various uses to 
which the Drive concept has been put (energizing; providing basis for reward, etc.) and considers in detail 
the extent to which these various aspects could, and do, correlate. The basic conclusion, that they correlate 
on the whole poorly, both logically and factually, still needs saying from time to time, and Gray has made an 
excellent job of sorting out the logical tangles in this area. 

Gray’s basic position in this book is that of ‘two-factor learning theory’, according to which Pavlovian and 
Thorndikian conditioning are two fundamentally different processes. Many people seem to find this 
dichotomy useful, so I shall probably be in a minority if I say that I find the logic behind this sort of 
distinction insufficiently thought out. Obviously, there are as many different kinds of learning as we like, if 
we care to concentrate on underlying mechanisms (contribution of the cortex, for example, to use one of 
Gray’s arguments). Visual and auditory learning presumably use very different brain mechanisms up to a 
certain point, but few theorists consider them different kinds of learning. The usually proposed operational 
distinctions need not add up to anything of importance for the animal either, as Gray himself admits. One 
operational distinction is that Pavlovian conditioning makes a reinforcer contingent upon a stimulus, while 
Thorndikian conditioning makes it contingent on a response. But why should this distinction be thought 
more important than one between different classes of responses, or stimuli? A response can easily be 
treated, if we wish, as a discriminative stimulus that reliably predicts reinforcement for the animal. In 
general, there is no barrier to distinguishing as many kinds of learning as we like if we are ‘splitters’, or as 
few, if we are ‘lumpers’. What Gray and others have yet to do is to lay down the criteria for speciation in 
this area. One idea they seem to have is that Pavlovian and Thorndikian conditioning require different 
theoretical accounts. But since we don’t yet have an accepted theory of either, this distinction is a little 
difficult to maintain. Gray has a particular problem here, because although he presents a detailed theoretical 
account of Pavlovian conditioning (the Rescorla-Wagner model, in particular, is well described) he never 
gets round to saying how he thinks Thorndikian conditioning works. At the point where the reader might 
reasonably have expected to find such a theory the book peters out into a treatment of special topics such as 
the PRE and frustration. Thus the whole problem of response initiation is bypassed, and it is disappointing 
to see the neglect in this book of theorists such as Bindra, who have seriously had a go at this problem 
starting from a basis of Pavlovian conditioning. (Bindra’s influential ‘Nebraska symposium’ article is not 
even referenced.) Thus Gray is really in the position of distinguishing two classes of learning, one of which 
he can explain, and the other of which he can’t; and it is not clear that this is a valid basis for a fundamental 
dichotomy. 

The ‘Elements’ is very clearly written, and encourages the reader to think critically. Gray has been 
successful in explaining the ‘mysterious ways learning theorists think’, but not, thank God, in demonstrating 
the mysterious way in which they often write. 

MICHAEL MORGAN 


Human Memory: Theory and Data. By Bennet B. Murdock, Jr., Potomac, Md.: Lawrence Erlbaum. 1974. 
Pp. 362. £11.00. 


Many recent books on Human Memory have been written by ‘Cognitive psychologists’, that is, by a group of 
people who find that the intricate, and often pleasing speculations in which they indulge are hampered by 
consideration of experimental data of any kind. 

Professor Murdock’s book may be taken as a useful definition of ‘cognitive psychology’ by precise 
antithesis. He sets out to deal with both theories and data, and the organization of the book reflects this 
worthy aim. Alternate chapters deal with the theoretical basis of particular problems and with the ‘data’ 
with which adequate theories must cope. 

It is clear that Murdock’s heart is with the data rather than with the theories; he remarks on page ix of his 
Preface: ‘To segregate theory and data is not a very satisfactory procedure. They should be thoroughly and 
carefully interrelated. However, we are not quite ready for that. On the one hand none of the theories we 
have now explain in any depth and with great precision even an appreciable fraction of the relevant data. On 
the other hand, to consider data alone would be barren and uninteresting’. 

One interesting thing about this book is that the data presented after each ‘theory’ chapter seem to have 
very little to do with the abstract considerations which introduce them. 

It should be made clear that this is a careful and scholarly book. In other words the fault is not with 
Professor Murdock. Is the fault, then, with the imprecision of theoretical statements or with the lack of 
ingenuity of their implementation in experimental studies? 

The gap between theory and data would, indeed, prove to be considerably larger if Murdock had not 
carefully chosen a particularly restricted view of ‘theories’. His theoretical chapters are concerned rather 
with the assumptions behind the computation of various performance indices, and with mathematical models 
for particular effects than with the carefree elaboration of ‘functional relationships’, ‘models’, ‘systems’ or 
‘processes’ in human memory, to which the last few years have accustomed us. 

When such ‘models’, etc., are discussed it is always in the context of choices between possible kinds of 
experimental paradigm which the use of such constructs may require. For example, chapter 2 is called 
‘Theories of item information.’ It presents a very clear account of ‘strength theory’, ‘threshold theory’ and, 
in general, of the use of signal detection statistics to handle confidence judgements and other performance 
indices relatively intractable by other means. The next theoretical chapter (4) is called ‘Theories of 
association’ and presents briefly and lucidly the background to the concepts of proactive and retroactive 
inhibition and interference, but then gratefully ‘takes off’ into a fine discussion of computational techniques 
and relevant data for Bower’s two-stage model, for the Suppes and Ginsberg ‘multi-state model’ and for 
Murdock’s own ‘fluctuation model’. 

The original contribution made by Professor Murdock in providing mathematical models of this kind is 
very important, and it is pleasant to have his work accessible in this form. Our present interest, however, 
should be in the kind of model that Murdock offers and prefers. Also with the partial view of the literature 
which it encourages. 

For example it is striking that the entire, admittedly very tedious, literature on ‘Mediation and Imagery’ is 
set out in about four pages (130-134) as a subset of the data available on ‘associations’. 

The remaining theoretical chapters are chapter 6 on ‘Theories of serial order’, chapter 8 on ‘Theories of 
free recall’ and possibly chapter 10 on ‘Models of memory’. The titles are exact, and not accidentally chosen. 
These are again ‘micro-theories’ concerned with the methodologies, experimental and mathematical, by 
means of which particular types of effect in the data can be studied (but not necessarily interpreted). 

Perhaps indeed we must concede after reading Murdock’s book that the literature on imagery is really only 
worth four pages of review. We may also have to concede that it may be right that the two chapters on ‘free 
recall’ should contain so little about models for propositional and semantic relationships in hypothetical 
memory systems, which currently form the basis for most discussions of ‘organizational factors’. There is a 
very high ratio of talk to data in the literature dealing with these factors. It is almost a negative emphasis 
that Collins and Quillian, Anderson and Bower, Rips et al., Bransford and Franks and other progenitors of 
work in this field should not be accorded neglect, but rather strikingly brief mention (e.g. notes on pages 
214-269 and elsewhere). It is as if Murdock is emphasizing that he is aware of this work, and of some 
obligation to quote it, but that he feels it does not contribute anything very substantial to the themes he 
follows. 

It must be said that in spite of the limitations Professor Murdock has set himself, and even perhaps 
because of these limitations, this is a very good book. It can be used very profitably by undergraduates and 
research students to gain a firm grasp of the methodology of memory experiments, to clearly understand the 
computational techniques involved and to begin their own literature searches with the benefit of a thoroughly 
adequate and up-to-date bibliography in this area. Those of us who have to lecture on these topics must 
remain indebted for some time for summaries of experimental work which are more concise, clear and 
complete than most of us can manage to produce for our own use. 

But does this book advance our understanding of human memory? And, if not, how is our understanding 
of human memory best advanced? 

Murdock has an underlying philosophy to which we must pay serious attention. The line taken by future 
work, as he sees things, should be: ‘By the standard “divide and conquer” strategy of scientific 
investigation’ to ‘isolate systems and paradigms and bring them into the laboratory for experimental 
investigation’ (p. 2). 

Is memory really to be understood best by some future synthesis based upon the complete understanding 
of a very large (perhaps infinitely large) number of particular paradigms and subsystems? 

The question about this book is whether it may not represent too extreme a reaction against the admittedly 
irritating trends in ‘Memory and cognition’ which are currently in vogue. The study of memory seems to be 
pursued, at present, by two very different kinds of investigators. A first group occupy themselves by 
propounding generalizations which are usually, even in principle, untestable; which are usually based on 
fiendishly complicated sets of theoretical assumptions and which require much labour to understand - the 
more particularly because they are framed in very idiosyncratic terminologies. These generalizations are 
deliberately selected to be ‘interesting’, in the rather weak sense (for science) that they purport to offer an 
account of phenomena that are part of our everyday experience. (The literature on imagery is, for example, 
a demonstration of how dull experiments can come from ‘interesting’ topics of speculation and 
introspection.) 

Professor Murdock is a leading exponent of a second attitude, and is one of those investigators who 
patiently and successfully educate their readers in a number of quite difficult concepts, leaving them at last 
with a very precise grasp of a number of things which they did not wish to know, and which are of little 
practical use to them. 

All this does seem a very sad waste of a great deal of directed effort, of high intelligence and of pleasing 
originality. There must be a middle way? 

PATRICK RABBITT 


Ordinary Ecstasy: Humanistic Psychology in Action. By John Rowan. London: Routledge & Kegan Paul. 
1976. Pp. v+234. £2.25. 


This is a swirl of a book. The contents go from Eastern mystics to 11 different ways to do research in 
psychology. Shuttling between accounts of levels of consciousness, how to change work situations, and the 
Assagioli egg diagram one appreciates John Rowan’s enthusiasms. He really seems to care about all of us. 
R. D. Laing comes out as less accepted than many might expect but overall, the book is glad about people. 
For example, our conflicts, we are told, need ‘serious pursuit’ ‘as an important road to wisdom’ 
(pp. 164/165). It is not quite clear whether this assertion parallels ‘Epidemics need serious pursuit as an important 
road to ending disease’ or ‘Health needs serious pursuit as an important road to happiness’. 

Quite probably posing such alternatives goes against the spirit in which Rowan wrote this book. He notes 
that he has written other books intended more for an academic readership and that Ordinary Ecstasy ‘is 
about what happened and what is going on’ (p. 173) in, as I take it, humanistic psychology. Theoretical 
argument is not his present concern. Instead we have a welter of approaches, a jostling crowd of terms, and 
a bibliography that is the book’s most useful offering (for this reviewer). The 253 references all have a 
mention of what Rowan thinks of them and how humanistic he sees them as being. For Rowan humanistic is 
good. Of course, other views are heard and, for example, the Koch piece in ‘New perspectives on encounter 
groups’ is duly noted in the bibliography. There is an openness of, at the least, allusion through the whole 
book. This frankness, the optimism, and an assortment of quotations to cover most tastes carries one through 
the 186 pages of text. Afterwards and intending no puns, one asks: where has our high speed trip got us? 
One realizes that an increasing number of people seek to ask new questions or repeat very old ones; that 
they believe that they have the start of answers; and that they see change as necessary and psychology as 
relevant. Not all of which is new. 

Amongst the sprawl of jargon, vogue words, and neologisms are sentences that catch wisdom from 
disparate traditions and criticisms which indict sharply and with cause. There are, too, simplifications: think 
of that optimism over conflict and then read Rowan’s awareness of horrors and idiocies in modern life 
(chapter 14). In this century we have had 100 000 000 and more people die at the hands of others. How can 
anyone see conflict offering a road to wisdom? 

By definition, more, even some, ecstasy would be delightful. It is not easy to see the book under review as 
helping us that way. There is little doubt that a package tour of cultural centres in Western countries can be 
very enjoyable and even a television series on ‘Civilization’ can bring new themes, whatever its presenter’s 
personal biases. Similarly reviews, compressions, selections, and outlines have some use. We may 
acknowledge those suggestions and still accept that ecstasy ranges from beatific visions to the immersion of 
the self in some particular performance of a piece of music, or ceremony, to joys shared with someone else 
and quite beyond either account or denial. To write a book on the ordinary parts of such experiences is 
allowable oxymoron but a formidable task. To claim a substantial overlap between ecstasy and the 
excitements and topics that make up the bazaar of humanistic psychology shows a great confidence, if not a 
hint of arrogance. To write the book in a kaleidoscopic way may divert and even provoke. If we are lucky 
such diversion or provocation will lead to understanding. For this reviewer once the dust settled the appeal 
of smaller projects resumed. This resumption was taken up not because huge themes are absurd, nor because 
concepts like prejudice and alienation are ill-founded (albeit challenging of definition). The resumption comes 
from thinking that figures matter as well as ground and that clear problems appeal more than frenetic 
simplicity. (For examples of the latter see the top paragraph of page 29, lines 19-27 of page 11, and lines 7-9 
of page 131.) 

GODFREY J. HARRISON 


Experiments in Distant Influence: Discoveries by Russia's Foremost Parapsychologist. By L. L. Vasiliev (edited 
with an introduction by Anita Gregory). London: Wildwood House. 1976. Pp. xi+241. £6.95. 


In the West, parapsychology grew up in conscious opposition to the teachings of scientific materialism. In 
the Soviet Union the emphasis was always on minimizing the difference between parapsychology and the 
established sciences. This can be seen in the writings of the late L. L. Vasiliev, former Professor of 
Physiology at the University of Leningrad and holder of the Order of Lenin, who, more than anyone else, 
launched parapsychology in the Soviet Union. He had been interested in the problem of telepathy ever since, 
in the early 1920s, he joined the Leningrad Institute for Brain Research where Bekhterev was attempting to 
demonstrate telepathy in dogs by means of mental commands. Perhaps his interest began even earlier 
because we learn here from the invaluable introduction provided by Anita Gregory that, as a boy, he was 
once nearly drowned while out playing only to hear, when his mother got back from a trip abroad, that she 
had had a vivid dream of the whole incident. At all events he continued working on the problem all through 
the 1930s but it was not until 1962 that his work became known when his university published his 
monograph. By then, thanks to the thaw, it had again become safe to publicize such unorthodox material. 
Then in the following year an English translation of the monograph was printed privately by Anita Gregory 
and her husband, the late C. C. L. Gregory, with Vasiliev's blessing. Vasiliev died in 1966 and his book, 
which by now has become a parapsychological classic, has at last been made widely available in this smart 
new edition. 

Although Vasiliev experimented with various tests of telepathy of the traditional kind, using words or 
pictures as targets, he preferred the induction of motor movements by remote suggestion and, in the 
standard set-up which he eventually developed, the suggestions are restricted to falling asleep or waking up 
as the case may be. The subject had to squeeze a bulb to show that he was awake so that a kymographic 
record could be kept of the precise times when the suggestions took effect. Vasiliev was fortunate in finding 
two women who performed especially well in these conditions. Having satisfied himself that he now had a 
more or less dependable set-up, he addressed himself to what he regarded as the crucial question, namely, is 
telepathy based on radiation from some band of the electromagnetic spectrum, as the Italian physicist 
Cazzamalli had suggested and even claimed to have demonstrated? He proceeded, therefore, to put his 
subjects inside Faraday cages that were impervious to such radiation only to discover, to his great surprise, 
that there was no falling off in the rate of success. He even continued to get positive results in his later 
long-distance experiments when the subject was in Sebastopol and the sender in Leningrad! He therefore 
concluded that telepathy could not, after all, be based on electromagnetic radiation — unless, conceivably, on 
the very low frequency range which the cages did not exclude, but this, he thought, most improbable. 
However, he did not abandon the hope that a physical basis for the phenomenon would some day be found. 

Perhaps the first thing a potential reader will want to ask is whether these findings are really as valid as 
Vasiliev himself believed. It must be said that his design of experiment is not one that easily lends itself to 
straightforward statistical evaluation. The subject, especially when placed in an unventilated chamber, may 
easily drop off without the benefit of suggestion, telepathic or otherwise, and, equally, he may well wake up 
of his own accord. Yet it is only in his later improved experiments that Vasiliev bothered to do a control run 
and compare the response latency when a suggestion was given with that which occurred when there was no 
suggestion. But, in these experiments, at least, there seems no doubt even from a cursory inspection that the 
respective time intervals do differ significantly. Nevertheless, as always in parapsychology, some doubts 
must remain until others have confirmed these findings and so far no one has reported a successful 
replication using a comparable set-up. 

So where does all this leave us? Not only do we still lack a repeatable method of eliciting ESP but we are 
no nearer an agreed explanation of it. The radiation hypothesis still has its devotees: recently the Italian 
physiologist, Guarino, has been canvassing the case for what he calls ‘thermodynamic radiation’, while, in 
this country, the mathematician, John Taylor, has been championing low frequency electromagnetic radiation 
as the answer. Most parapsychologists, however, doubt whether ESP could be any sort of energetic 
phenomenon. Yet, Vasiliev's career was not in vain. His moral courage and doggedness set an example 
which has helped Soviet parapsychology to survive the recurrent political pressures to which it has since 
been exposed. It seems the authorities there have never been able to decide whether to suppress it, and 
so risk losing out to the West on something that may prove to be of value, or whether to allow it and risk 
unleashing an occult explosion. In Leningrad, G. A. Sergeev, another physiologist, has been carrying on the 
Vasiliev tradition and has been investigating Mme Kulagina, Russia's number one PK subject (she moves 
small objects at a distance) who was originally a protégée of Vasiliev. Soviet parapsychology has now 
reached a point where it is exerting a definite influence in the West, as can be seen from the popularity 
among researchers of the new psychophysiological approach to psychic phenomena. 

JOHN BELOFF 


Criminality and Psychiatric Disorders. By S. B. Guze. London: Oxford University Press. 1976. Pp. 181. 
£7.50. 


Follow-up studies are to the criminologist what the post-mortem examination is to the physician - the 
ultimate revealers of truth. Because follow-up studies are difficult and expensive they are rare. For these 
reasons alone this eight- to nine-year follow-up study of 223 male and 66 female felons (what we would call 
indictable offenders) demands attention. 

If in addition the methodology is ship-shape, with all the terms and diagnoses clearly defined, and the 
collection of the samples, in prison, clinic and hospital, well described, then that attention becomes 


The object of this 15-year study was to determine the relationship between serious crime and psychiatric 
diagnoses. The principal subjects were male and female parolees, supplemented by smaller samples from a 
forensic psychiatry clinic (representing the 2 per cent of felons who are diverted from the penal to the 
hospital system), and a third study starting with 500 hospital patients amongst whom 22 felons were 
identified. 

The results of these studies and a review of the literature are consistent. Sociopathy (diagnosed by the 
presence of any two of the following criteria: history of violence, school delinquency, poor job record, 
period of wandering, being a runaway), alcoholism and drug dependence are the psychiatric disorders 
characteristically associated with serious crime, while schizophrenia, primary affective disorders, anxiety 
neurosis, obsessional neurosis, phobic neurosis, and brain syndromes are not. Unless associated with 
sociopathy, alcoholism, or drug dependence, illegal sexual deviation is not associated with serious crime. 

So it depends what you mean by psychiatric disorder. If you confine it to schizophrenia, affective 
psychosis, epilepsy, mental deficiency, organic brain disease, then only 3 or 4 per cent of felons are 
included. But 85 per cent can be labelled as sociopathy, alcoholism or drug dependence, and if the neuroses 
(12 per cent) and homosexuality (1 per cent) are added then 90 per cent of felons have one or more 
psychiatric diagnoses. Illness-related crimes occur but ‘they are infrequent and therefore, contribute little to 
the overall association between crime and psychiatric illness’. A similar picture emerges for the women, with 
very high rates of sociopathy, alcoholism and drug dependence. The prevalence of hysteria was higher than 
amongst male felons, and more than 20 times that in the general population. Sociopathy or hysteria were 
found in 80 per cent of female felons, 26 per cent with both, indicating ‘a significant association between 
hysteria and sociopathy’ and suggesting that ‘at least some cases of hysteria and sociopathy show a common 
etiology’. 

This link between hysteria and sociopathy was recognized by Lee Robins (1966) who questioned the usual 
interpretation of conversion symptoms as unconscious defences against anxiety as in the neuroses; she 
pointed out that hysteria appears more closely related to sociopathy: of her 20 women followed-up after 30 
years, ‘the childhood history and background certainly resembles that of sociopathy, as does the early adult 
history . . .’. 

The wives and relatives were interviewed revealing that convicted felons characteristically come from 
severely disordered families and social backgrounds. Poverty, broken homes, parental criminality and 
alcoholism are ‘nearly always present’, though ‘many first degree relatives. . .exposed to the same adverse 
social and family circumstances, are free of delinquency and criminality’. 

Serious crime, then, takes its place amongst a constellation of social ineffectiveness and abuse of alcohol 
and drugs. As the author points out, ‘. . . unfortunately, available treatments for these conditions are still not 
dependable in most cases’, so that current psychiatric treatment is unlikely to be very effective. In seeking 
an alternative solution he seizes upon the striking reduction in recidivism associated with increasing age, 
and, perhaps rather naively, suggests that ‘individuals convicted of serious crime, or those with a history of 
previous felony convictions, could be imprisoned until they reach an age at which the risk of recidivism is 
markedly reduced, which, on the basis of the studies reported here, is approximately age 40’. He goes on to 
state that by combining age and extent of previous criminality, it is possible to identify groups of convicted 
felons with strikingly different risks of recidivism. Unhappily, as we know from the Baxstrom and other 
studies, while it is possible to identify the more dangerous offenders, the process is overinclusive, so that 
many less dangerous persons would be mislabelled. The total effect of the author’s solution would be 
very long sentences for a large number of felons (who are on the whole quite a young group), as well as the 
addition of those falsely labelled. American prisons may not be quite so full as ours but even so they 
could hardly stand this population explosion. 

Perhaps the feature of this valuable and careful research which may appeal most to psychologists is the 
confirmation of the association between sociopathy and hysteria: ‘. . . Parallel sociological, psychological, 
clinical, and biological investigations should be carried out for both conditions. The possibility that the same 
causal factors will result in different clinical pictures, depending on the sex of the individual, raises many 
interesting questions. What are the important biological and cultural factors contributing to the sex 
differences. To what extent will changes in the status of women affect this sex difference? Can the sex 
difference be used to identify the most important features of culture and family life related to the 
pathogenesis of the two conditions?’ 

P. D. SCOTT 


The Application of Social Psychology to Clinical Practice. By S. S. Brehm. New York: Halsted. 1976. 

Pp. xiv+277. £10.80. 
A less misleading title might be ‘Some Theoretical Applications of Social Psychology in Clinical 
Psychology’. There is no attempt to discuss applications in wider clinical practice than clinical psychology. 
Similarly there is no attempt to describe applications that have actually occurred with clinical populations. 

The author's aim is to provide for clinical psychologists alternative models, or ‘general perspective(s) 
when approaching another human being’, to the two currently favoured models of psychoanalysis and 
learning theory. 

Three main theories are presented: reactance theory, dissonance theory and theories of attribution. Each is 
described in some detail and similarities to other related theories are indicated; then some possible 
applications of the theories to clinical practice are discussed, using evidence from experimental, non-clinical 
studies. In addition, other theories (commitment, objective self-awareness and self-expressive decision 
making) are briefly discussed. 

For the main theories, the book gives enough information to give some understanding of the theory and a 
feeling for the type of clinical situation in which the author believes it might usefully be applied. However, 
one is left with some reservations about applying the theories. 

First, choice of theory is not adequately discussed and the author gives little guidance about or even 
comparison of different clinical procedures which would be followed using different theories. No attempt is 
made to compare the three theories presented and only minimal attempts are made at comparison with learning and 
psychoanalytic approaches. Even where comparisons are made with the learning theory analysis, the 
comparison is made with learning theory rather than with demonstrated clinical applications of learning 
approaches. This would appear to be an unfortunate omission as the author admits that difficulties may arise 
in application that are not anticipated from the theoretical presentation. Indeed, she is critical of some 
experiments on dissonance theory, suggesting that the principles of dissonance theory have been 
misinterpreted. Given such difficulties, plus one of the author's rare comparative statements that dissonance 
and operant conditioning approaches lead to theoretical but not practical differences, one would be wise to 
avoid dissonance theory and use the better validated operant approaches. 

The author tends to sell the theories, using one-sided rather than two- or multiple-sided presentations, so 
that the data on which the reader might make his own comparisons are not available. For example, on the 
interesting and important issue of the number and pattern of rewards leading to greatest resistance to 
extinction, only one reference (Lawrence & Festinger, 1962) is offered, ignoring the mass of other 
experimental data, not to mention the various attempts to explain the data. 

These points hint at the main difficulty of the book: that the author is trying to suggest different theoretical 
models to the learning and psychoanalytic models without a clear indication that they are in any way better. 
In her concluding chapter, she makes it clear how she would make such a judgement, presenting her basic 
belief that clinical psychology is a form of social engineering where techniques are used because they work, 
rather than because they fit some theoretical design. The models she presented have not been demonstrated 
to work with clinical populations or even in individual case studies of patients, and where possible one 
should presumably continue to work with models of demonstrated utility. On this basis, the section on 
attribution theories is of most interest to clinicians because here at least the models have demonstrated 
therapeutic utility within normal populations. Otherwise, the book presents some interesting ideas which 
might be investigated empirically with patients for whom alternative treatment approaches have not been 
successful. 

M. JOHNSTON 


On the Psychology of Military Incompetence. By Norman F. Dixon. London: Jonathan Cape. Pp. 448. £6.95. 
Dr Dixon, an officer in the Royal Engineers before he became a psychologist, has written a book using his 
second professional identity to illuminate what must have aroused his passionate concern in his first. The first 
half consists of historical thumbnail sketches of military disasters involving the gratuitous loss of many 
thousands of lives as the result of the incompetence of generals during the last hundred years or so of 
British military enterprises. This part which makes fascinating and horrifying reading is based on general 
historical and military historical sources. Throughout the book Dixon carefully points out that British military 
history does not, after all, consist of disasters only, and that controversy persists in the evaluation of some 
generals’ performances. While I am not competent to judge the rights and wrongs of such controversies, the 
episodes he presents ring only too horribly true. 
This very well-written book is obviously meant first and foremost for military readers. Dixon prepares 
them skilfully for the concepts and terminology of what is to follow by highlighting psychological features in 
his disaster-bent generals. Whether he can carry along those who share or admire such features is another 
question. 

The second part, devoted to discussion and ‘explanation’, concerns psychologists most. A wide range of 
psychological concepts and theories are introduced and continuously exemplified by aspects of military 
behaviour. Dixon is an eclectic; he draws on information theory, ethology and behaviouristic theories within 
a psychoanalytically oriented framework. Contrary to traditional assumptions Dixon shows that military 
ineptitude is the result not of inadequacy of intellect but of personality: ego weakness and authoritarianism. 
This personality syndrome is now widely accepted since its central features have been confirmed by 
empirical studies conducted within a variety of theoretical approaches: it has probably much to do with 
incompetence in all walks of life, not only in the armed forces. But Dixon's social psychological analysis of 
militarism reveals an awful paradox: military organizations are so set up as to be in peacetime particularly 
attractive to inadequate personalities who feel protected in their basic insecurity by the requirement for blind 
obedience, drill, honour above all, and repression of independent thought and initiative. Promotion to the 
highest level is likely to be given to those who excel in these qualities which, in time of war, are 
prescriptions for disaster. 

The book ends with a chapter entitled ‘Retreat’; not from the psychological conclusions based on an 
analysis of past events, but retreat from suggesting ways in which military incompetence could be avoided in 
the future. Indeed, Dixon points out that the complex technology of modern warfare, which inevitably 
separates a general from his men by the interpositions of many layers of staff, may interfere with the 
intentions even of competent leaders. The only relief from this fundamental pessimism is Dixon’s analysis of 
the life histories of a few generals who found themselves in command not because but in spite of their 
non-authoritarian personalities. It would be safer for all of us to be able to rely on disarmament than on the 
chance selection of competent and humane generals. 

Given Dixon’s eclecticism which combines psychoanalysis with Eysenck, molar with molecular approaches 
(to mention only two bedfellows who do not often sleep together in the psychological literature) it is no 
wonder that no tight theory of military incompetence emerges. While even his loose theory is occasionally 
tenable thanks only to procrustean efforts, for psychologists the most relevant feature of the book is its 
combination of various theoretical orientations to a specific problem. Many theoretical controversies which 
plague psychology are here implicitly revealed as unnecessary, for each theory illuminates a different aspect 
of military behaviour. These theories need not, indeed they cannot, be compatible because they address 
different questions and different phenomena. Not that psychology should progress without controversy, let 
alone without tighter theories than Dixon’s, but his successful use of various theories for various purposes 
suggests that some of our fights are against windmills. 

Again by implication the book contains a challenge for social psychologists: the development of concepts 
and measuring devices to deal with the fit between organizational and personal attributes and its 
consequences. This issue is central to Dixon’s thesis; but unfortunately it has not found expression in the 
psychological principles he lists as the basis for his explanations (p. 24). Because he has nowhere dealt 
systematically with the concept of fit, he evades the question of how the promotion of competence comes 
about in military organizations. How did Wavell, Alexander or Slim achieve command? The idea of a fit 
between an organization and an individual is more complex than Dixon tacitly assumes. But it would be 
unfair to blame the author of an original and interesting book in applied psychology for not having solved all 
the conceptual issues which his subject raises. 

MARIE JAHODA 


Psychological Testing: The Measurement of Intelligence, Ability and Personality. By Paul Kline. London: 

Malaby Press. 1976. Pp. 168. £4.75. 

This book has been written, presumably, for that now infamous organism, the educated layman. It does not 
purport to be a standard work on assessment for professional psychologists, and will not replace any of the 
more usual sources to which practising clinicians will go for information. How, then, does the book seem to 
a psychologist who does not find it at all difficult to put himself in the place of the educated layman - at 
least so far as psychological testing is concerned? 

I found three major criticisms. First, many psychologists would question some of the assumptions of 
Kline's approach, which are often put forward as accepted psychological doctrine. In particular, I was not 
happy with the distinction he makes between fluid and crystallized ability, even though Cattell might believe 
it to be a real one. The point is not clarified by a baffling error occurring on page 15: ‘Thus we can say that 
fluid ability is largely innate or inherited but that crystallized ability is still largely determined by heredity’. 
Neither would all psychologists accept that there is any real basis for an arbitrary division of human 
behaviour into ‘intelligence, aptitude, personality, motivation and interest’. 

My second major criticism is that he claims to be describing the Skinnerian approach to personality. In 
short, he is mistaken. He has unfortunately misunderstood the main point of radical behaviourism when he 
states that behaviourism ‘. . . ignores entirely the feelings of the patient which is, of course, because 
Skinner is interested in behaviour and not in feelings or verbal reports of feelings’ (p. 41). It is time that this 
tiresome myth was finally laid low in British psychology. B. F. Skinner, one of the few remaining 
Skinnerians, has, since 1945, concerned himself with the analysis of feelings, thinking, reasoning, etc. In 1963 
he wrote ‘Behaviourism at Fifty’, a paper entirely about private events, which he claimed was ‘A 
re-statement of radical behaviourism’ and in 1974, in ‘About behaviourism’ he said that the behaviourist 
account of ‘mental life’ was ‘. . .the heart of radical behaviourism’ (p. 212). 

Finally, many will not be altogether happy with Kline’s discussion of the social implications of 
psychological testing. At the present time, scientists are being forced to become more aware of the fact that 
they are operating within a changing society, and I would be interested to hear the comments of politically 
aware psychologists when they read that ‘It is important not to overemphasize the social implications of the 
use of psychological tests’ (p. 145). 

Notwithstanding these criticisms, I believe that Kline has written a useful and eminently readable book. 
His style, reflecting at times his background as a classicist, is witty and amusing. He leaves the reader with 
the impression that psychology must provide the Monty Python team with much of its material. I could not 
resist: ‘Finally on one occasion around the monthly peak of our subject’s sex drive, she went out and 
bought her father a present, a jar of pickled eels. She was unused to buying her parents presents and had 
never bought eels before - such perfect Freudian symbolism needs no pointing out’ (p. 91). I wonder if it 
was significant that the eels were pickled? 

The book points to some of the current controversies in psychology, and discusses clearly some of the 
uses and abuses of the more commonly used tests. It should help to dispel many of the false impressions the 
educated layman might have about psychology and psychologists, and is recommended for this. The price - 
which is steep for such a small book - might, however, prevent it from reaching the appropriate audience. 

CHRIS CULLEN 


Personal Goals and Work Design. Edited by Peter Warr. London and New York: Wiley. 1976. Pp. xiv+264. 

(Price not given.) 

In 1974 a conference on psychological and other approaches to the study of life at work was held at York, 
England, under the auspices of the Scientific Affairs Division of NATO. This is the book of the conference. 
As Warr points out in his Introduction, the volume is not just a string of conference presentations. Most of 
the speakers rewrote their contributions to take account of the comments and discussions which their papers 
gave rise to. 

The conference was planned to look at the quality of life at work from three main standpoints: issues of 
understanding, value and change. This pattern has been held to in the book. Thus chapters 1-14 cover a 
variety of topics from the three standpoints, some contributors ‘thinking big’, some ‘thinking small’ (to use 
Warr's classification), and the final part of the book - Chapters 15, 16, 17 — is an attempt to pull together the 
salient points from the three standpoints. Considering the amorphous nature of the general theme of the 
conference, one could certainly claim a measure of success for the book from this point of view. There is, 
however, a single, tantalizing, recurring theme running through the volume which is not faced squarely until 
the final chapters. This is how change qua change should be handled — or even envisaged. Critics of the 
book will immediately react with: But how could anyone expect anything in the way of objective 
consideration of change under such an umbrella? And indeed, one of the conference contributors (Strien, 
chapter 8), voices misgivings of just this kind. To quote, ‘Quite a few psychologists would question whether 
NATO is the right channel through which the affiliated governments should sponsor scientific exchange, and 
in this respect, I must confess I had to overcome certain scruples in coming to this conference.’ Reading the 
first 14 chapters of the book one gets the impression that the contributors were being careful to keep clear of 
any treatment of change which might be labelled radical. However, in the last three chapters there is more 
openness. For instance, Elliott, chapter 16: ‘Is minor relief from unpleasant work good in itself, even though 
this may divert attention from the need for fundamental change? Can some worthwhile changes be made that 
do not require a shift of power?’ And Pettigrew, chapter 17: ‘Those conference papers which ultimately 
question conventional wisdom in the fields of personal goals and work design, clearly signify that changes will 
be required beyond the boundaries of task and technology . . .’. Warr himself, in his Introduction (p. xi), 
touches briefly on fundamental aspects of change: ‘Should we try to institute small changes, perhaps to 
reduce dissatisfaction of a few employees, or is that too narrow or superficial an approach? Should we try to 
alter a whole system, perhaps attempting to influence the initial design or total operation of production 
systems, or is that impracticable and overly idealistic?’ In this connection, however, it could be said that one 
of the great merits of the book — and of the conference - is that, despite reticences, the fundamental 
importance of this basic consideration, change, clearly emerges. 

From a general point of view, the book is striking for its variety of dimension within its overall pattern. 
For example, approach - psychological, sociological, etc.; ethical vs. technological and organizational 
innovation; macroscopic (‘thinking big’) vs. microscopic (‘thinking small’); tough-minded vs. tender-minded 
thinking; diversity of ‘home ground’ of the contributors - USA, UK, Canada, New Zealand, Belgium, 
Holland, West Germany. The contributors themselves include a range of well-known figures - Fred Fiedler, 
Louis Davies, Pieter Drenth, Albert Cherns; Peter Warr, the Editor, is also a contributor. 

A main aim of the conference was to investigate how far work can be designed to meet both 
organizational requirements and the personal goals of employees. From the book it can be seen that the 
venture — conference and book - has illuminated some crucial issues in this complex area. 

JAMES G. M'COMISKY 


An exceptional memory 


Ian M. L. Hunter 





An account is given of the exceptional memory of the late Professor A. C. Aitken who was also a 
distinguished mathematician and mental calculator. Examples are given of his performance in memory-span 
tasks and his recall of artificial materials learnt under experimental conditions 27 years earlier. His memory 
is stronger than average in all respects but normal in mode of functioning. It relates to his comprehending 
materials in terms of rich patterns of multiple, often recondite, properties. Compared with Shereshevskii, 
another man with exceptional memory, he shows the scholar’s reliance on conceptual mapping rather than 
the mnemonist’s reliance on perceptual chaining. 





Professor Alexander Craig Aitken, FRS (1895-1967) was a man of far-outstanding intellect. He 
was a brilliant mathematician (Whittaker & Bartlett, 1968) who had ‘in large measure the kind of 
mystical insight into problems which characterized, for example, Isaac Newton’ (Collar, 1967). 
He was a uniquely able mental calculator (Aitken, 1954; Hunter, 1962, 1965, 1966, 1968). He was 
an accomplished violinist. He was also legendary for his memory. The purpose of this paper is 
to give an account of his exceptional memory. 

To say that a man has exceptional memory is like saying he has exceptional athletic or artistic 
ability: it only roughly delineates his prowess. Thus, exceptional memory can rightly be claimed 
for the Russian, Shereshevskii (Luria, 1969), the American, V.P. (Hunt & Love, 1972), and 
Aitken; yet each man has a different pattern of memorial talent integral with a different style of 
mental life. So, in what sense did Aitken have exceptional memory? Briefly, he was unusually 
erudite with a scholar's disposition to become absorbed by, and retentive of, things relating to 
his spheres of erudition. He could readily produce, out of his head, much detailed information 
and could rapidly learn new information that interested him. His memory was (and this was also 
his own view) exceptional in degree rather than in kind. 


Overview of Aitken's memory 


Aitken could produce a host of recondite facts about numbers, calculative methods, mathematics 
and mathematicians; play, on the violin, many pieces by heart; recall many musical 
compositions; securely identify many snatches of music heard or seen in written notation; quote 
extensively from English literature; and recite tracts of Latin and English verse. He could recall 
details of many events he had witnessed, so much so that committees often consulted him as an 
unofficial minute book. In daily affairs, he was conspicuously, but not officiously, precise about 
names, dates, locations. The following excerpt from his reminiscences about the First World 
War illustrates his characteristic precision and his recall of details that would elude most people 
(the platoon mentioned would comprise 39 men). On 14 July, 1916, he was in France, lying in a 
dug-out trying to sleep. Sleep 
proved impossible; each time I closed my eyes I heard again, as though it were in the dug-out itself, the 
whistle of the falling mortar-bombs, and I saw Hughes, Robertson, Sergeant Bree, Harper, and the line of 
trees. But gradually, through and across these repercussions, I became aware of a conversation in low 
tones going on somewhere behind me, apparently between Captain Hargest and Mr Rae, and perhaps 
occasionally someone else — but I am not sure of this. However that may be, something was missing; a 
roll-book; the roll-book of Platoon 10, my old Platoon. Urgently required, it seemed; Battalion had rung 
up, requesting a list of the night's casualties and a full state of the Platoon. Apparently surnames were 
available, but the book was nowhere to be found. This being suddenly clear, I had no difficulty, having a 
well-trained memory now brought by stress into a condition almost of hypermnesia, in bringing the lost 
roll-book before me, almost, as it were, floating; I imagined it either taken away by Mr Johnston or 
perhaps in the pocket of Sergeant Bree in no-man's-land. Speaking from the matting I offered to dictate 
the details; full name, regimental number, and the rest; they were taken down, by whom I do not know 
[Aitken, 1963, pp. 107-108]. 

Many stories are told about the range, tenacity, rapidity and precision of Aitken's seemingly 
effortless memory. Two typical, and reliable, stories relate to the early 1920s. 


He taught me Classics at Otago Boys' High School and he used to amuse us at our lessons by 
demonstrating how he could associate line numbers in our Virgil with the words in the line or conversely 
could recite the words in any specified line [Personal communication from Dr Harold Taylor, sometime 
Vice-Chancellor of Keele University]. 

As a young teacher, a single reading of the names and initials of a new class of 35 boys enabled him never 
to consult the lists again [Hudson, 1967]. 


His memory, however, had limits. He did not have “total recall” if, by this, is meant some
mythical ability to recall absolutely anything he had ever experienced; nor was he always able to 
recall, on the instant, things he could recall on other occasions. To illustrate, in 1960-1 he 
attempted to recall some words and numbers he had learnt, under experimental conditions, in 
1932 (see below): he recalled a lot, but by no means all, and made remarks such as ‘the others
are not recoverable although in an extreme state, such as insomnia, they might come back’ and
‘I felt this must be wrong, hence I decided to do what I often do, to “wait for illumination”,
and not to hurry the process’. His knowledge, though great, was not encyclopaedic, e.g. he
knew little about sports, and even with regard to music, where he knew a great deal, he 
remarked that the musical knowledge of Professor Tovey, of Edinburgh University, made him 
‘feel like a prattling child’. (Examples of Tovey's extraordinary musical memory are given by 
Grierson, 1952.) Finally, he did have a memory span. Sutherland (1937) reports assessing 
Aitken’s memory span in 1932 by asking him to repeat back sequences of items presented at a 
rate of two items per second. With auditory presentation of sequences of random letters, the 
span was 10: with auditory presentation of random digits, 13: and with visual presentation of 
random digits, 15. Evidence is given below about Aitken’s memory span in 1961. 


The evidence and the problem 


The present paper draws on four main sources of evidence about Aitken. First, my own contacts 
with him. Between Dec. 1960 and Dec. 1961 we talked much together, tape-recorded interviews, 
exchanged letters. Our dialogue focused on mental calculation, but the topic of memory arose. 
Secondly, publications by Aitken (1954, 1962, 1963) that allude to memory. Thirdly, the 
unpublished transcript of a 1954 radio broadcast entitled ‘Faster than thought’. This was an 
unscripted discussion among Aitken, Wim Klein (Dutch mental calculator), Cyril Burt 
(psychologist) and B. V. Bowden (computer scientist). Fourthly, an unpublished manuscript 
(Sutherland, 1937) reporting Aitken’s performances in experimental tasks given in 1932-3 by Dr 
John Sutherland and Dr Boris Semeonoff, both of Edinburgh University. During 1960-1 Aitken 
recalled some details of these experiments and in 1962, when I received Sutherland’s manuscript, 
I was able to check the accuracy of what Aitken had recalled. 

This sprawling mass of evidence is anecdotally interesting, not to say awe-inspiring. It firmly
dispels any notion which might be gathered from Luria (1969) that someone who remembers
detail is necessarily unable to work creatively with general ideas. But it poses a difficult problem 
for psychology. How are we to progress beyond anecdote, fragmentary observation, and poetic 
metaphor, so as to make, from the evidence, something both faithful and of general 
psychological interest? This problem can be approached in several ways, but the approach 
adopted here is as follows. Surveying the evidence as comprehensively as possible, let us ask: is 
there any one single feature which, more than any other, recurs throughout the evidence and 
holds it together? 

It turns out that the single feature which most permeates the evidence is “interest in
meaning” — a passionate interest in patterns that interrelate the deep and the surface properties 
of things. Bartlett’s (1932) phrase, “effort after meaning”, would serve as an apt label, with the
qualification that the “effort” is undertaken out of spontaneous interest and the “meaning”
has a multilevel, Virgilian quality of penetrating an event so as to discern many interconnected 
aspects of it, and relate it to other events, and yet do little to violate the uniqueness of the event. 


Interest in meaning 


Interest in meaning, in the sense just mentioned, is illustrated by the following excerpts from a 
letter, dated 30 Jan. 1932, sent by Aitken to Dr Sutherland. They show that interest in discerning 
multiple properties is a support for, and is supported by, abilities to learn and remember.


Of musical memory and intensive practice in developing this faculty, this dates almost completely from
1912, age 17, one might almost say from after the war, when I devoted myself to the violin, and to the 
endeavour to learn how to read from score and how to compose, very seriously. Musical memory can, I 
am certain, be developed to a more remarkable degree than any other, for we have a metre and a rhythm, 
a tune, or more than one, the harmony, the instrumental colour, a particular emotion or sequence of 
emotion, a meaning, however difficult to express in other terms, in the executant an auditory, a rhythmic 
and a muscular and functional memory; and secondarily in my case, a visual image of the page which 
comes to the rescue when all the rest flag; perhaps also a human interest in the composer, with whom one 
may identify oneself for the time that the composition is being heard or performed; and an aesthetic 
interest in the form of the piece. They are so many, and they are so cumulative, that the development of 
musical memory, and appreciation, has a multitude of supports. 

Recurring decimals interested me as a boy; they have lost much of their glamour now. At school I once
had to work out as an imposition the decimal for 1/97, a task which I accomplished fairly quickly by 
finding to my surprise that when nine or ten digits had been obtained the rest could be derived from them 
by dividing by two and continuing. This cheerful fact, of very simple explanation, so pleased me that I 
remembered the 96 digits of 1/97 ever afterwards, not by mentally working them out, but by pure memory 
of an auditory or rhythmic kind. It has the property, shared by the recurring decimals of fractions with 
denominators 7, 17, 19, etc., that any other fraction with the same denominator but a different numerator 
has the same digits in the recurring period in a cyclic order. [For example, in the decimal below, the 
decimal for 48/97 starts at 0.4948453608... Once Aitken had discovered some of these properties of
recurring decimals, he was on the lookout for them in new contexts.] To elucidate my remark about 
auditory-rhythmic memory of the number, I shall bar it musically (the bars being of unequal time-length) 
and put in the accents: 


0 |10309 2 |783505154 6 |391 7 |5257731 9 |587628865979381 4 |
432 9 |896907216 4 |9484536082474 2 |268041237 1 |1340206185567 ||


Here every second digit receives an accent, and the phrases are of varied length, as in music, or in 
breath-groups in singing. 

In the Sixth form I remembered by heart whole books of Virgil and of “Paradise Lost”, of which I am
still able to recall longish passages, and such facts as, for example, that in the Eighth Book of Virgil, there
is a spondaic line fairly early, “Pallantis proavi de nomine Pallanteum”, another at line 167 (I say this
offhand, but feel sure of it) “Discedens chlamydemque auro dedit intertextam”, and at line 596 the
well-known onomatopoeic line, “Quadrupedante putrem sonitu quatit ungula campum”. In those days I
used to read the Georgics and other parts of Virgil than those we were studying at the time, from pure 
pleasure. [The above details from the Aeneid are all, in fact, correct.] 
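
The numerical properties of 1/97 that Aitken mentions in this letter can be checked by ordinary long division. The following Python sketch is illustrative only and forms no part of Aitken's or Sutherland's material; the function name `period_digits` is an assumption introduced here. It confirms that the recurring period has 96 digits, that 48/97 runs through the same digits in cyclic order, and that after ten digits the remainder is 49. Since 49/97 = 1/2 + (1/97)/2, this is one way of seeing why the remaining digits can be had "by dividing by two and continuing".

```python
def period_digits(numerator: int, denominator: int) -> str:
    """Return one full recurring period of numerator/denominator by long division.

    Assumes 0 < numerator < denominator and that the denominator shares no
    factor with 10, so that the expansion is purely recurring.
    """
    digits = []
    remainder = numerator
    while True:
        remainder *= 10
        digits.append(str(remainder // denominator))
        remainder %= denominator
        if remainder == numerator:  # back at the starting remainder: period complete
            return "".join(digits)


one_97 = period_digits(1, 97)
forty_eight_97 = period_digits(48, 97)

# Length of the recurring period of 1/97.
period_length = len(one_97)

# 48/97 has the same digits in cyclic order, i.e. it is a rotation of 1/97.
# Searching in the doubled string finds where the rotation begins.
offset = (one_97 + one_97).find(forty_eight_97)
is_rotation = offset != -1

# Remainder left after the first ten digits of 1/97 have been obtained.
remainder_after_ten = pow(10, 10, 97)

print(period_length, is_rotation, offset, remainder_after_ten)
print(one_97[:10], forty_eight_97[:10])
```

The same function applies to the other denominators Aitken names (7, 17, 19), where every numerator likewise yields a rotation of a single period.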


Discernment of multiple properties 

Aitken's memory was intimately linked with his ability to discern multiple properties that were 
interwoven into distinctive patterns. His discernment could work rapidly to produce an unusually 
rich, densely structured gestalt of properties; and so many things, that would seem chaotic to a 
bystander, were, to him, embodiments of multiple properties that meshed into an interesting, 

The ease with which he learnt and remembered anything, and indeed whether he learnt and
remembered it at all, depended squarely on the meaning and interest it had for him. Thus,
whenever something interested him deeply, he was typically able, later, to recall many details of
it despite his having had not the slightest conscious intention of committing anything to memory.
Again, if he were given material that, for him, had little meaning (say, a random string of digits),
he typically pronounced it ‘uninteresting’ or even ‘repellent’. If asked to commit such material
to memory, he might oblige if he thought some psychological value might emerge from the 
exercise, but usually remarked that the exercise was ‘unnatural’ and ‘went against the grain’. 
(Throughout this paper, all words bounded by single quotation marks are Aitken's, unless 
otherwise specified.) If he did undertake ‘unnatural’ memorizing, he adopted a characteristic 
approach as follows. 

When given material that was ‘not too repellent’ and asked to memorize it, he did not, as 
might be expected, go tense in concentration. He went noticeably still and relaxed. When asked 
about this curious behaviour, he explained that he was using a subterfuge (“assimilation by
interest”) on which, he discovered years ago, he could rely. He was relaxing by way of
preparing to find interest in the material or ‘to let the properties of the material reveal
themselves’. He felt best able to secure memorization by refraining from deliberate
interpretation and organization; rather, he cleared his mind and relinquished the job to his 
vast cognitive system, allowing it to work largely autonomously and in whatever way came most 
naturally. He commented as follows. 


I discovered that the further I proceeded, the more I needed relaxation, not concentration as ordinarily 
understood. One must be relaxed, yet possessed, in order to do this well. Sometimes one enters a 
bookshop and there, displayed on the stalls, are various interesting books. One selects, dips, reads, 
becomes intent, until the stage is reached when all surroundings are forgotten. Afterwards, one leaves the 
shop and enters the street again, blinking at the light and at the people as if one had come out of an 
anaesthetic. And so it is here. The one requisite is that a live interest in the subject should fix an 
undeviating attention. . .Interest is the thing. Interest focuses the attention. At first one might have to 
concentrate, but as soon as possible one should relax. Very few people do that. Unfortunately, it is not 
taught at school where knowledge is acquired by rote, by learning by heart, sometimes against the grain. 
The thing to do is to learn by heart, not because one has to, but because one loves the thing and is 
interested in it. Then one has moved away from concentration to relaxation. 


Psychobiography 
Aitken's mnemic activities were inextricably part of a larger configuration including all his 
psychological processes, e.g. his general knowledge, preferred pursuits, emotional and 
intellectual attitudes to the world and to himself: his memory was non-detachable from his entire 
psychophysical make-up, from ‘the participation of the whole personality’. Of his personality, it 
might fairly be said that he was, above all, a reflective man who sought to comprehend events by 
discerning their inward patterns. With his talents, this mild-mannered man might well have made 
more of a worldly splash, but it would have ‘gone against the grain’ to pursue, say, political 
power or success in business. It was supremely consistent that he found his eventual 
professional calling in a field of scholarship devoted to uncovering deep mathematical patterns. 
"Interest in meaning" was a leitmotiv of his psychobiography. Starting in his mid-teens and 
continuing into his late-twenties, he was enthralled by number, music, poetry and his own 
intellectual capabilities. These were the years, involving ‘a kind of mental Yoga’, in which he 
became demonstrably exceptional through attempting to penetrate the inwardness of numerical 
relations, the architecture of poetry and music, the system of his own mental skills, the 
interconnectedness of phenomena at large. Almost certainly, he was interested, at this time, in 
testing his mental powers to the limit, e.g. discovering how much Milton he could recite 
verbatim or how fast he could calculate. Almost certainly, too, he enjoyed demonstrating his 
prowess to others. But it is equally certain that, in later adult years, he rarely memorized 
anything merely for the sake of memorizing it. He would deliberately memorize what he thought
might be useful to have readily available in his head, e.g. the names of his students. More often, 
things got memorized as an unintended by-product of his penetrating interest in them. 

His experiences during the First World War affected him deeply and added a further 
dimension to his interest in meaning. The war not only removed him from the tranquil 
surroundings of his youth but plunged him into a harrowing lottery of death and destruction. He 
was haunted by these experiences ever afterwards and by their sheer lack of meaning. He tried 
laying them to rest by formulating them in words and, as late as 1963, he published his literary 
account of them. It is probably not too wide of the mark to say that the war impelled him to a 
heightened interest in the ongoing flow of his personal experiences as something to be 
contemplated and searched for transcendent meaning. However that may be, he returned from 
the war to complete his degree work in New Zealand, develop his interest in music, teach Classics 
for a time, and then go to Edinburgh to pursue mathematics. 

By the time I came to know him in 1960-61, he was a distinguished scholar who had received 
many academic honours and who took no greater interest than was necessary in conventionally 
practical affairs. His interests were in ideas and in ways of understanding things. One of the 
things he was interested to understand was the markedly unconventional mind he had built over 
his lifetime. ‘The weight of experience can at times be a burden’ because these experiences had, 
to some extent, a life of their own: they often presented him with sudden, unexpected solutions 
to mathematical problems; deprived him of sleep; and seemed to demand that they be organized 
into harmoniously coherent patterns. He was intrigued by mental processes in general and, in 
particular, by the seemingly mysterious ways of his own mental processes. It was this aspect of 
his widespread intellectual curiosity that made him so willing to participate in any constructive 
attempt to understand high-level psychological functioning. 

To conclude this section, it is clearly misleading to draw a firm line around certain of Aitken’s 
activities, call them “memory”, view them as an isolated “faculty”, and attempt to understand
them without going outside the circle we have drawn. Aitken’s memory was one aspect of an 
immensely complex, many-sided, highly-integrated system of mutually supporting activities 
cumulatively developed over the years. In fanciful language, Mnemosyne is mother of the 
Muses, but mother and daughters work together as a family team. 


Three experiments 


This section reports three experiments in which Aitken learned and remembered what he called 
‘unnatural material under unnatural conditions’. 


Memory span for digits 


On 23 May, 1961, Aitken and I tape-recorded a two-hour interview devoted mainly to mental calculation but 
including a brief venture into memory span. The transcript of our dialogue runs as follows. 
IH: The conventional procedure is to read the digits out at one a second.
AA: I prefer about five a second.
IH: I'll give twelve digits, reading out very slowly. 2, 2, 1, 7, 6, 8, 6, 5, 8, 4, 6, 8 [the rate is one digit
every 1.5 sec].
AA: 2, 2, 1, 7, 6, 8, 6, 5, 8, now I lose it. I know it ends in 6, 8. It's too slow for me. Try again. I must
accommodate myself.
IH: 1, 9, 3, 6, 2, 7, 5, 9, 4, 6, 1, 3 [one digit every 1.5 sec].
AA: [Repeats the twelve digits perfectly in a total time of 3 sec]. It begins to seem that I could acquire the
skill. After a bit, I could get it alright.
IH: What would this skill be?
AA: The skill would be in slackening my pace down to a speed to which I'm not accustomed. Like
learning to ride a bicycle slowly.
IH: Are you interpreting these numbers in any way?
AA: No, not at all. Try another twelve.
IH: 9, 3, 2, 2, 5, 3, 6, 4, 3, 9, 0, 7 [one digit every 1.5 sec].
AA: [Repeats perfectly in a total time of 3 sec].
IH: You couldn't give those backwards?
AA: No, I couldn't. Just now, that would be 7, 0, 9, 3, 4, 6, I lose it for a moment. I know it goes back to
2, 2, 3, 9, but there's a gap of two digits at that moment that I'm not quite sure about. With practice
I could acquire the skill. If it were ten digits read out in two fives rapidly, then it would be easy.
IH: 1, 0, 2, 7, 5. 3, 9, 6, 2, 3 [first five digits in total time of 1 sec, pause of under 1 sec, last five digits in
1 sec].
AA: [Repeats perfectly in a total time of 2 sec with no pause. Then repeats backwards in a total time of
six seconds with a brief pause midway].
IH: When doing it backwards, do you have to recite it forwards first?
AA: No. In a curious way I have it both forwards and backwards. It's a kind of shimmer. It's as if it had
left an impression. It's this funny faculty of neither seeing nor hearing. It's the kind of thing I can't
describe. It's starting to go already, by the way [this is 30 sec after completing the backwards
recital]. It's fading at once simply because you asked me to do it backwards, and that obliterates the
forwards. I know it's 1, 0, 2, 7, 5 and then it's starting to go. I can't get the rest.
IH: How about fifteen digits in groups of five read rapidly?
AA: Yes. Here's the rhythm [demonstrates the rhythm by tapping].
IH: 2, 8, 4, 1, 5. 0, 6, 1, 8, 8. 6, 4, 8, 5, 2 [rate is exactly five digits per second with two pauses of
1 sec each].
AA: [Repeats all 15 digits accurately in a total time of 3 sec, no pauses]. You need to be in something like
perfect athletic form.

The dialogue continued by my asking whether, at some future time, he would be able to recall any of these 
digit sequences. He was confident not. So what would happen if I gave him a sequence to learn so that he 
could recall it, say, tomorrow? In that case, he would have to adopt a different attitude; let the sequence 
‘soak in’ so that he could grasp its auditory-rhythmic properties and perhaps some of the numerical 
properties. He could do it, but it ‘would go against the grain’. 

Notice that, in the memory span task, he did not "interpret" the material and could not recall it after a 
delay of a minute or so. This supports the conclusion that his durable learning depended on discerning 
properties in the material. This conclusion is also supported by his comment about needing a different 
approach for durable learning. Notice too, his references, not merely to the skilled nature of the 
performance involved, but also to the need to adjust performance to make it appropriate for the particular 
requirements of the task. 


A list of 25 unrelated words 


Sutherland (1937) reports that one of the tasks undertaken by Aitken in 1932-3 was to memorize the 
following list of 25 words. HEAD, GREEN, WATER, SING, DEAD, LONG, SHIP, MAKE, WOMAN, FRIENDLY, 
BAKE, ASK, COLD, STALK, DANCE, VILLAGE, POND, SICK, PRIDE, BRING, INK, ANGRY, NEEDLE, SWIM, GO. A 
trial consisted of Sutherland’s reading out the list at one word per second, and then Aitken's trying to recite 
the words in sequence. The proceedings may be summarized as follows. 


Trial 1. 12 correct. Recalls first 12 words in sequence, nothing more. 

Trial 2. 14 correct. Recalls first 4, omits next 3, recalls MAKE-to-POND but substitutes SICK for STALK. 
Trial 3. 23 correct. Omits SING but adds it at end of list. Omits PRIDE and BRING. 

Trial 4. All correct. 

Trial 5. (1 week after trial 4). All correct. 

Trial 6. (3 months after trial 5). 22 correct. Omits DEAD, LONG and SHIP. 

Trial 7. (Immediately after trial 6). All correct. 

Trial 8. (15 months after trial 7). All correct, except that ANGRY is advanced two places. 


Twenty-seven years later, on 6 Dec. 1960, Aitken mentioned to me that he might try to recall something of 
the tasks done in 1932-3. On 10 Dec. 1960, he sent me a letter about some of these tasks. The following 
refers to the word list. 


When I left you, I thought of the list of words which Dr Sutherland gave me to memorize. Well, I 
thought about this list as I walked up Chambers Street from the Staff Club, and at once got part of the list. 
..., MAKE, WOMAN, FRIENDLY, BAKE, ASK, COLD, STALK, ... I realized that this was in the middle, and
must be preceded by perhaps seven words and followed by ten words or so. At the top of Chambers 
Street I had three of the immediately preceding words - as I thought. ..., DEAD, LONG, SHIP, MAKE,
WOMAN, etc. But I had a lingering idea that it might be HEAD, LONG, SHIP, etc. ‘Dead’ or ‘head’ thought 
I? Why this doubt, quite apart from similarity of sound? At that very moment the focus shifted towards 
the other end, and I had this. ..., ASK, COLD, STALK, DANCE, VILLAGE, POND, SICK, ..., ANGRY, NEEDLE,
SWIM, GO. 

And in the sound of the last three I recognized the end of Dr Sutherland’s list. I took a bus now in 
George IV Bridge and went to the top deck. No sooner seated, I had that late gap filled, thus. ..., DANCE,
VILLAGE, POND, SICK, PRIDE, BRING, INK, ANGRY, NEEDLE, SWIM, GO. As the bus went down the Mound, 
passing New College, I suddenly got the beginning, and saw why there had been that hesitation between 
‘dead’ and ‘head’. The first word in the list was ‘head’!! Now I had it. HEAD, GREEN, WATER, SING, 
DEAD, LONG, SHIP, // MAKE, WOMAN, FRIENDLY, BAKE, ASK, COLD, STALK, // DANCE, VILLAGE, POND, SICK,
// PRIDE, BRING, INK, ANGRY, NEEDLE, SWIM, GO. Notice the caesurae which I have marked, //. And
feeling this at once to be 7 plus 7 plus 4 plus 7 = 25, and remembering from 1933 that almost my first 
overall observation was that there were 25 words, I felt the complete certainty, which I have at this 
moment, that I had the list, and in the proper order. I do not doubt it. 


This experiment spans 28 years, shows remarkable long-term retention, and calls for three comments. 
First, Aitken did not learn or recall the list as a strict chain in which the first word leads to the second, the 
second to the third, and so on to the end of the list. In 1932, his learning trials showed omissions that were 
inconsistent with a strictly cumulative chain. In 1960, he did not recall the list by retrieving the first word, 
then the second, and so on to the last word. His recall showed that he used an overall structure, in the form 
of a rhythmic pattern, to hold the words together in sequence. 

Secondly, although he did not, in 1960, recall the list in my presence, I have complete confidence that his 
account is fastidiously truthful. He assured me he had no written records of Dr Sutherland’s materials and, 
to the best of his knowledge, had not recalled these materials since 1933. Furthermore, his recall of the list, 
although impressive, was “characteristically Aitken”: on 7 Dec. 1961, when I unexpectedly mentioned this
list, he wrote it out, there and then, to illustrate its property of rhythmic grouping. Thirdly, in the entire 
evidence about Aitken, I have never found any instance where he turned out to be wrong after he had 
expressed confidence in the correctness of his recall. 


A list of 16 three-digit numbers 


Another task done in 1932 was to memorize this list. 194, 503, 876, 327, 714, 961, 583, 259, 487, 364, 950, 
613, 294, 437, 182, 659. A trial consisted of presenting the numbers visually, in the window of a memory 
drum, at 2 sec intervals; after the whole list had been shown, Aitken attempted to recall the numbers in 
sequence. The proceedings may be summarized as follows. 


Trial 1. 6 correct. Recalls first 3 numbers, then 487, 437, 659. 

Trial 2. 10 correct. Recalls first 7 in sequence, then last 3 in sequence. 

Trial 3. 14 correct. First 7 in sequence, 487-to-613 in sequence, last 3 in sequence. 

Trial 4. All correct. 

Trial 5. (48 hours later, without fresh presentation). All correct except that 327 is omitted. 
Trial 6. (Immediately after trial 5 and after one presentation). All correct. 


Four years after trial 6, with no further presentations but some attempts at recall in the period just before 
reporting his recall, he recalls 12 correct and four wrong numbers. The sequence is jumbled, as follows
(correct numbers italicized by me). 194, 327, 503, 364, 487, 613, 576, 583, 961, 294, 437, 421, 874, 750, 259,
659.

On 14 Dec. 1960, Aitken sent me a letter which included the following (correct numbers italicized by me).


A journey to London, and return, shook me out of my Edinburgh groove enough to set even more of 
those tests with Dr Sutherland vibrating again. I suddenly began to have an impression of various numbers 
of three digits he had given me to look at and, after an interval of some days, to reproduce. I enclose what 
I have recovered of this list. Some seem more familiar than others. Thus 294, 961, 327, 259, 576, 583 rouse 
in me more conviction than others. Here is the list. 194, 294, 327, 482, 953, 961, 659, 367 (feels doubtful, it 
may be 364), 613, 259, 874, 437, 583, 576, 487 (I notice that it is a palindrome of 874, another in the list 
which, however, may have been 876), 503, 421 (which may not have appeared at all). There are 17 
numbers of three digits in the above list. This seems too many; but I do not recall whether Dr Sutherland 
gave me 16 or 20. 

This experiment shows that Aitken could forget and that, as with the word list, he did not construct a
strictly cumulative chain. In this experiment, however, there is no clue as to how he gave overall structure to 
the list. Whatever that structure might have been, it was not recalled in 1936; nor in 1936 did he recall all the 
items correctly and in their original sequence. In 1960, he remembered, about the list as a whole, only that it 
had either 16 or 20 items; and although he recalled more than half of the items correctly, he did so without 
confidence and without any attempt to place them in their original sequence. 
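
Which of the recalled numbers were actually in the presented list can be recomputed from the three lists quoted above. The Python sketch below is an illustration added here, not part of the original analysis; the helper name `score` is hypothetical. It makes two simplifying assumptions: an item counts as correct if it appears anywhere in the presented list, regardless of position, and where Aitken offers an alternative in 1960 (367 or 364, 874 or 876) only his first-offered value is scored.

```python
# The 16 numbers presented in 1932.
original = [194, 503, 876, 327, 714, 961, 583, 259, 487, 364, 950,
            613, 294, 437, 182, 659]

# Recall four years after trial 6 (1936), in the order reported.
recall_1936 = [194, 327, 503, 364, 487, 613, 576, 583, 961, 294, 437, 421,
               874, 750, 259, 659]

# Recall in the letter of 14 Dec. 1960, first-offered values only.
recall_1960 = [194, 294, 327, 482, 953, 961, 659, 367, 613, 259, 874, 437,
               583, 576, 487, 503, 421]


def score(recalled, presented):
    """Split a recall attempt into presented items and intrusions, ignoring order."""
    presented_set = set(presented)
    correct = [n for n in recalled if n in presented_set]
    wrong = [n for n in recalled if n not in presented_set]
    return correct, wrong


for year, recalled in (("1936", recall_1936), ("1960", recall_1960)):
    correct, wrong = score(recalled, original)
    print(year, len(correct), "correct:", correct, "| not in list:", wrong)
```

Under these assumptions the 1936 attempt gives 12 correct and four wrong numbers, as stated above, and the 1960 attempt gives 11 of the 16 presented numbers, consistent with "more than half".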


Aitken and Shereshevskii 


Consider Aitken's memory alongside that of people in general. The comparison reminds us how 
greatly individuals differ with regard to what they retain from their past for use in their present 
pursuits. Experiments on memorization, for example, show striking differences between children 
of different ages (Brown, 1975; Hunter, 1976) and people from different cultures (Cole & 
Scribner, 1974). Again, people from different historical-cultural backgrounds, and from different 
walks of life, vary conspicuously in the extent to which they retain things in their head or have 
recourse to external recording devices: and autobiographies show that people differ, at different 
times in their life, in how they view and value their personal past experiences (Herbert, 1974). 

A comparison between Aitken and the “average” adult shows that his accomplishments of
memory are, at every turn, stronger than average. He has a longer memory span, a more 
retentive grip on things he has learnt and, overall, a larger and more finely articulated cognitive 
system. At the same time, nothing about Aitken violates what we know about the “normal”
design features of memory. It is normal, for example, that high-level learning and remembering 
depends on interest and, crucially, on the process of comprehending material in terms of 
patterns of multiple properties. Even his subterfuge of “assimilation by interest” is within the
experience of many highly intelligent people. For example, when experienced actors deliberately 
set out to master a new role, they do not focus on the task of memorization as such but, rather, 
on studying the role with a view to discerning a network of meanings in it: this pursuit of 
meaning has the effect of securing memorization as a by-product (Smirnov, 1973, chapter 3). 

Now, compare the present account of Aitken with Luria's (1969) account of S. V.
Shereshevskii (hereafter called “S.”). The comparison shows how different are the exceptional
memories of a scholar and a mnemonist.

S. was an outstanding professional mnemonist, that is, someone who memorizes haphazard 
strings of items. He used the classical mnemonist's technique of imagining richly vivid, concrete 
mental-pictures, which he arranged in a chain of pairs. If given the 25-word list mentioned 
above, he might take an imaginary walk along a street that has a vivid succession of landmarks. 
He would represent the first word by a distinctively imaged picture which he would “locate” on
the first landmark; the second word would be another mental picture “located” on the second
landmark, and so on. During a single, and not too rapid, presentation of the word list, he would 
progressively devise such a chain of images that was extremely rich in perception-like properties. 
This would result in accurate, durable memorization. Even years afterwards, he would be able to 
recapitulate the chain of images, and, so, recall the list of words in either forward or backward 
sequence. Contrast this procedure with the way Aitken memorized the list, not by a strict chain, 
but by a kind of overall melody. 

The chief similarity between Aitken and S. is that each comprehends materials in terms of a 
multiplicity of unconventional properties, knits these properties into fairly unconventional
patterns, and durably retains these patterns so as to be able to reconstruct the original materials. 
The chief difference is the kind of property involved and the kind of pattern woven. 
Characteristically, S.'s kind of property is perception-like, i.e. particular sensory qualities and 
particular imaged objects: his kind of pattern is the chain, i.e. short-run links between successively
encountered items. Characteristically, Aitken's kind of property is conceptual: his kind of
pattern is the panorama or map, i.e. long-run groupings of items into overlapping multilayered 
configurations. 

The following comparison illustrates the different kind of property by which the two men
comprehend materials. First, remarks made by S. in 1936. ‘Even numbers remind me of images. 
Take the number 1. This is a proud, well-built man; 2 is a high-spirited woman; 3 is a gloomy 
person (why, I don’t know); 6 a man with a swollen foot; 7 a man with a mustache; 8 a very 
stout woman — a sack within a sack. As for the number 87, what I see is a fat woman and a man 
twirling his mustache' (Luria, 1969, p. 31). Now, remarks made by Aitken in 1932. Sutherland 
(1937) showed him a mixed series of written words and numbers, and asked him to report what 
each item called immediately to mind. Here is the response to “7”. ‘The line of poetry “They
passed the pleiades and the planets seven” — mysteries in the minds of the ancients - Sabbath or
seventh day — religious observance of Sunday - 7 in contrast with 13 and with 3 in 
superstition — 7 as a recurring decimal 142857 which, multiplied by 123456, gives the same 
numbers in cyclic order - a poem on numbers by Binyon, seen in a review lately — I could quote 
from it.' 
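
Aitken's remark about 142857 can be checked directly. The short sketch below is an illustration added here, not part of the quoted material; it assumes that 'multiplied by 123456' means multiplying by each of 1, 2, 3, 4, 5 and 6 in turn, and the helper name `rotations` is hypothetical.

```python
def rotations(digits: str) -> set[str]:
    """Return every cyclic rotation of a digit string."""
    return {digits[i:] + digits[:i] for i in range(len(digits))}


base = 142857  # repeating block of the decimal expansion of 1/7

# Multiply by 1..6 (assumed reading of the quoted '123456').
products = [base * k for k in range(1, 7)]

# Each product should be a cyclic rotation of the original digits.
is_cyclic = all(str(p) in rotations(str(base)) for p in products)

print(products)
print(is_cyclic)
```

Each of the six products uses the same six digits in the same cyclic order, which is the property Aitken alludes to.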

In broad terms, then, Aitken comprehends materials in terms of rich conceptual maps; S. in 
terms of rich perceptual chains. These contrasting modes of comprehension show in the 
intellectual profiles of the two men. S.'s great distinction is in memorizing haphazard strings of 
items; Aitken's is in mathematical thinking. In an Olympic Games for Mental Prowess, S. would 
win the gold in the section for mnemonists, while Aitken would scarcely qualify for entry. But 
S. would not even be considered for entry to three sections where Aitken would shine, namely 
for theoretical mathematicians, mental calculators, and all-rounders. 

These contrasting modes of comprehension have general implications that are best illustrated 
by reference to mnemonics. Mnemonics are mental contrivances that enable us to impose ad hoc 
meaning on material that is otherwise meaningless to us; and they typically involve our 
comprehending the material in terms of perceptual chains (Hunter, 1964, 1977; Norman, 1969). 
Now, mnemonics are demonstrably effective in securing memorization, and most people are glad 
of their help on occasion; yet since the time of classical Greece, serious students of memory 
have, to say the least, been reluctant to prescribe their widespread use. Why? Because they 
involve us in focusing upon the kind of property, and kind of pattern, that has severely limited 
utility for productive thinking. 

The disadvantage of mnemonics, as a widespread mode of comprehension, is shown by S. He 
has some characteristic difficulties in understanding, and thinking about, things which most 
people readily comprehend in conventionally meaningful terms. A pithy example is given on 
pages 129-130 of Luria's book. Again, when S. memorizes a mathematical formula (p. 49), he 
comprehends it in terms that have no mathematical meaning; and so nothing he learns can be 
turned to account in solving any mathematical problem. The way he comprehends the formula is 
a caricature of the school child who, lured by short-term gains, embarks on a vicious cycle of
“rote learning” based on mathematically superficial properties and patterns. 

By contrast, if Aitken were asked to memorize a meaningless formula, of the kind which S. 
would unhesitatingly commit to memory, he would almost certainly refuse on the grounds that 
the task is ‘too repellent’. It is completely characteristic of Aitken’s mode of comprehending 
things, and of his attitude to knowledge in general, that he should comment as follows. 
‘Mnemonics I never use and deeply distrust. They introduce an alien perturbation into a mind 
that should, as I would have it, be pure and limpid.' 

Children's perceptual organization of seriated displays: Evidence against a
memory reorganization hypothesis 


Richard F. Cromer 


The memory for a seriated display and its reorganization over an eight-month interval was examined in 
educationally subnormal children. By including groups of children who viewed a random display and an 
array of disordered sticks, it was found that the reorganization into a more seriated drawing after the 
passage of time was not directly based on the original stimulus. Various controls including copying of the 
original material, and matching and recognition conditions, give evidence that the child's cognitive level 
affects encoding of the material as well as its later output. The observed phenomenon of ‘memory’
improvement may have little to do with stored images, and may instead be linked to the developmental
symbolic level of the child which influences original perception as well as ‘memory’. It was also found that
educationally subnormal children perform like normal children of the same mental age on the seriation task. 


In the 1940s Piaget and Szeminska (Piaget, 1965; see also, Inhelder & Piaget, 1964) studied the 
development of the ability to place a number of sticks in a seriated order from the largest to the 
smallest. It was found that when enough sticks were used so that the child was prevented from 
merely making a perceptual adjustment to an intuitive whole, the child was unable systematically 
to seriate the sticks until about 7 or 8 years of age. It is not until that age, according to Piaget, 
that the child has achieved true operational reversibility so that he is simultaneously aware that 
any given stick is both smaller than the preceding sticks and larger than those which follow it. 
More recently, Piaget & Inhelder (1973) have extended the work on seriation ability to include
effects on long-term memory (see also Inhelder, 1969). Young children aged 3-9 years were
shown a seriated set of ten sticks and were asked to reproduce, i.e. draw, what they had seen 
both after an interval of one week and again eight months later. More than 70 per cent of the 
children showed evidence of improved memory after eight months, i.e. their second drawing 
from memory was better than their first drawing in that it was more organized in a seriated 
fashion. To many people, these results seem surprising since the usual expectation is for a 
general deterioration to have occurred. Piaget, however, interpreted the results in terms of a 
theory of memory reorganization which owes its origin to Bartlett (1932). According to Piaget's 
hypothesis, some types of information which lend themselves to particular conceptualizations at 
one stage of development, will change over time as more advanced stages of development are 
reached. Thus, as the child advances in his operational seriation ability, the actual memory of a 
seriated array he has seen and encoded will also advance. This occurs because, according to 
Piaget, the memory is actually a representative symbol of a conceptual schema which has 
undergone development (Piaget & Inhelder, 1973). 

Several recent experiments (Altemeyer, Fulton & Berney, 1969; Dahlem, 1969; Blackstock & 
King, 1973; Finkel & Crowley, 1973) have confirmed Piaget's results to varying degrees, while 
controlling various factors which could have been responsible for the results (see Carey, 1971 
for a critical review). Altemeyer et al. (1969), for example, included a group of children who
were shown a randomly ordered array of the differently sized sticks. Surprisingly, these children
also drew more orderly arrangements of sticks as their memory-drawing months later. These 
results, however, were also interpreted in terms of memory reorganization. Two possibilities 
were envisaged along the lines originally noted by Bresson (1970), and these represent the ways 
several followers of Piaget alternatively understand his theory. In what they called the ‘content’ 
explanation, Altemeyer et al. hypothesized that the actual content of the memory changed over 
time. Thus, the image of the random sticks became more organized into an orderly array. By
contrast, a second explanation, the ‘process’ explanation, assumes that as central processes 
develop children become more able to conceptualize various arrays as patterned arrays. In other 
words, the content of the memory, such as ‘a bunch of sticks’, remains the same, but its 
decoding changes over time. Altemeyer et al. felt that either of these explanations could account 
for their finding, that children who had seen randomly arranged sticks also produced drawings of 
more seriated arrays after the interval of several months. Both suggested explanations, however, 
begin with the assumption of a process of memory reorganization based on some earlier stored 
image or memory trace. In one, the memory image itself undergoes changes; in the other, the 
manner in which a memory trace is decoded changes developmentally. Both of these commonly 
advocated interpretations of Piaget's theory, however, pay little attention to encoding processes,
and presuppose that during the original perception, the display is merely seen and recorded. It 
may be, however, that even the original perception itself involves elaborate constructive 
processes that are linked to cognitive development. 

The following experiment was designed to explore the entire phenomenon in greater depth. It 
was hoped that one could learn more about the encoding processes by presenting both seriated 
and non-seriated stimuli and by including conditions in which children copied the array while 
viewing it, and in which some children were asked to match the stimulus to several patterns. 
Further information concerning memory was to be gained by including recognition tasks for 
some groups rather than recall through drawing. A second aim of the experiment was to 
ascertain whether educationally subnormal children behave in a seriation task in the same 
manner as children of normal intelligence, and to see whether the memory changes originally 
reported by Piaget also occur in these children. It was hoped that a number of specific questions 
could be answered by this experiment, and these are listed below with their rationale: 

1. Will children draw more adequately seriated displays as their memory image over the 
passage of time during their cognitive development? This is a replication of Piaget's original 
results extended here to subnormal children. 

2. Will children draw these more seriated patterns even when shown a non-seriated stimulus? 
This is a replication of the Altemeyer et al. findings extended to subnormal children, and with an 
additional important control (see 3). 

3. In most of the previous studies of this problem, including those originally carried out by 
Piaget and by Altemeyer et al., children have been given a test of their seriation ability at the 
time that they first viewed the stimulus materials to be stored for memory. Such a test, while 
necessary to establish the original operational level of the child, nevertheless contaminates the 
memory results by providing a pressure for the child to see or act on sticks of various sizes in 
an orderly, seriated manner. In the present experiment, will children draw more seriated displays 
when there has been no such pressure? (see Method section). 

4. It can be argued that the ‘memory improvement’ results in previous experiments have been 
due to improvement over time in drawing ability. In order to control for this, this experiment 
asks whether similar results will be obtained in recognition rather than recall. A matching task at 
the time of first viewing the stimulus also controls for this variable (see 5). 

5. Will a child's presumed operational level affect his encoding of the stimulus? Piaget has 
included some reports of children directly copying the seriated display. This experiment attempts 
to note some features of encoding both through copying and by having one group match the 
stimulus to an identical stimulus on a display board. 

6. Can the improvement of seriation drawings be accounted for by the influence of a good 
Gestalt which causes children to draw more seriated displays over time? It should be noted that 
if this explanation is true, children of all operational levels should draw increasingly seriated 
patterns. The alternative, Piagetian hypothesis would predict that compared to original copies, 
the drawings of non-operational children should deteriorate, and only those of operational 
children should improve. 

Figure 1. Arrays of sticks as presented on the large display board for matching and recognition. [Panel labels: (a), (b), (c), (d), (e).]


Method 
Subjects 


Originally, 84 educationally subnormal children were tested. By the end of the experiment eight months 
later, some children had moved away or were persistently absent, so that a total of 67 children completed all 
stages in the study. A child is classified as educationally subnormal when he is not able to keep up with 
normal schooling and is about two years behind in his school subjects. The chronological ages of the 67 
children ranged from 7:8 to 12:9, with a median age of 10:0. Their mental ages, as ascertained from the 
Peabody Picture Vocabulary Test, ranged from 3:3 to 9:8, with a median of 5:9, while their IQs ranged 
from 44 to 83, with a median of 65. 


Materials 


Ten 6 mm dowelling rods, varying in equal steps from 9 to 16 cm in length, were used. For one display, they
were mounted on a black 20 cm square card in a seriated fashion from the longest on the left to the shortest 
on the right. A second set of similar sticks was mounted on a similar card, but in a random rather than a 
seriated fashion. A third, large display was used for the matching and recognition tasks, consisting of a 
board 64.5 × 42.7 cm on which were mounted six display cards, each 20 cm square and containing ten sticks (see
Fig. 1). One of these consisted of the seriated display described above (b); another was the random display 
(e); two displays were what Piaget found to be ‘primitive’ patterns, with sticks grouped merely into ‘long 
ones and short ones’, (a) and (f); another pattern was an intermediate type, the ‘arrow’ pattern (c); and 
finally, one display consisted of a mirror reversal of the target seriated pattern (d). 

In addition, an unmounted set of the ten differently sized sticks was used with one of the control groups, 
and to present the standard Piagetian seriation task to all subjects at the end of the experiment. For the 
latter task, nine additional sticks were used, intermediate in size between the ten sticks of the seriation task.


Procedure 


The children were randomly assigned to seven groups. The median mental age of each group of children 
completing the entire experiment was 5:7 for five of the groups, and 6:1 for the remaining two groups. All
children were tested individually. Each child was first given the Peabody Picture Vocabulary Test in order to
ascertain his mental age and IQ. Then, children in six of the seven groups were shown a test stimulus and
asked to describe it. There were two test stimuli: a seriated pattern corresponding to Fig. 1(b), and a
random pattern corresponding to Fig. 1(e). The groups viewing one or other of these two stimuli were each 
subdivided into a recall group, a matching group, and a recognition group. All groups made a copy of the 
stimulus while viewing it. Children in the matching groups, in addition, were shown the board with the six 
arrays (Fig. 1) while still viewing the original stimulus. Their task was to point to the one array on the board 
which was the same as the stimulus in front of them. All children in all groups were told to look very
carefully at the stimulus so that they could remember it exactly as it was ‘a long time from now’. Both the 
recall groups and the matching groups had to draw the original array from memory one week later and again 
eight months later. At those times, however, children in the recognition groups, instead of drawing what they 
had seen earlier, were given the recognition test consisting of pointing to the one array of the six on the 
stimulus board which they recognized. The sequence of the essential tasks for each group was:


Recall Groups: Copy, draw one week later, draw eight months later 
Matching Groups: Copy, match, draw one week later, draw eight months later 
Recognition Groups: Copy, point one week later, point eight months later 


It should be noted that the children in the recognition groups had not seen the board with the six arrays 
until the time of their first recognition test, and children in the recall groups did not see such arrays until 
eight months later. Thus, children in these recall groups were never exposed to a seriated display of any kind 
unless they were in the group for whom such a display was the original stimulus to be remembered. Only
the matching groups were exposed to the various arrays of the large stimulus board during the initial testing. 

A seventh group (the ‘disordered’ display group) was not shown any stimulus figure, but merely given
ten loose sticks to play with which were identical to those mounted in the seriated and random displays. 
They were then asked to draw them as if they were in a row. After their drawing had been put aside, they 
were asked to look at the disordered sticks so that they could remember them. One week later and again 
eight months later they had to draw what they had seen from memory. 

At the conclusion of all testing, children were given the Piagetian test of seriation ability to determine their 
operational level on that task. Giving a seriation test is necessary in order to establish an independent 
measure of operative ability (see Carey, 1971). However, as noted earlier, it is essential that such a test be
given only at the completion of the experiment so that its administration with the consequent emphasis of 
placing sticks in a seriated order will have no effect on the results of the basic memory phenomenon. Of the 
memory for seriation experiments reported, only that by Finkel & Crowley (1973) is free from this important 
source of contamination of results, but they neglected to get an independent measure of operative level at 
the end of the experiment. 

The assumption in this experiment is of a normal developmental progression in seriation ability, an 
assumption which does not violate any reported findings on operational seriation itself (as distinct from 
memory drawings and other related phenomena). This is supported by results from an experiment by Liben
(1974) on memory for horizontality of liquid levels in tilted bottles. In her operational assessment tests, given 
to some groups both at the beginning and end of the experiment, only three out of 35 children showed a 
decline in their operational ability. Control for this factor in addition to the seven groups already represented 
in the design of this experiment was impossible given the available number of subnormal children in the 
subject pool, and the exactness of the subject's operational level was not considered as being as crucial as 
the necessity to avoid the child being exposed to pressures to seriate the material. Thus, the assumption 
throughout is that children were at or below their seriation level at earlier points in the experiment. A
non-seriator on the operational test given at the end of the experiment was assumed to be a non-seriator at
earlier points in time. Intermediate children and those who had operational seriation were considered to be at 
or below their final level at the earlier points in time. 

It should be noted from the design of this experiment that any differences in the memories of the matching 
and recall groups would be attributable to children in the matching groups having seen the various displays 
on the board which children in the recall groups had never seen. Thus, viewing the patterns on the display 
board was controlled, and by postponing any test of seriation ability until the completion of the experiment, 
children were not exposed to any pressures towards increased organization of their perceptions or memories. 


Results 

Seriation ability 

The children were categorized on the basis of the Piagetian seriation test at the completion of 
the experiment, eight months after original testing. Non-seriators were defined as children who 
completely failed, or who attained some trial and error ordering but without the base remaining 
horizontal. This corresponds to stages I and II in Piaget & Inhelder (1973, pp. 29-30). Children 
were classified as being in the intermediate stage if they achieved a trial and error success, but 
were unable to insert correctly the additional sticks into their seriation. These are identical to 

Table 1. Direct copy of the seriated stimulus: Number and percentage of children of three
seriation levels drawing various patterns

                                                  Type of copy
                                Random          Primitive       Near-seriation  Seriation
                                No.      %      No.      %      No.      %      No.      %
Children seeing seriated stimulus
  I (Non-seriators), n = 11       1    9.1        3   27.3        3   27.3        4   36.4
  II (Intermediates), n = 6       1   16.7        0      -        3   50.0        2   33.3
  III (Seriators), n = 12         0      -        0      -        4   33.3        8   66.7

* S = 88, Tau = 0.314, z = 1.96, P < 0.03, one-tailed.


Piaget's stage III children. The third group consisted of children who were able to perform 
"operational" seriation and incorporate the additional sticks into their series, and are listed by 
Piaget as stage IV children on this task. When categorized in terms of the seriation task, no 
developmental trend with chronological age is observable in these educationally subnormal 
children, the 27 non-seriators having a median CA of 10:0, the 19 intermediates having a median 
CA of 9:4 and the 21 seriators having a median CA of 10:2. However, there is a marked trend 
with mental age measured by the PPVT (median test, χ² = 10.10, P < 0.01), with non-seriators
being 5:5, intermediates 5:11, and seriators 6:6 years old mentally. 


Copying 
The children's drawings were classified into five major categories: 

1. disordered - representation of some sticks, but not in a row; 

2. random - sticks drawn in a row, but randomly arranged as regards length; 

3. primitive — sticks grouped into sets of short and long sticks, similar to the arrangements 
shown in Fig. 1(a) and (f); 

4. near-seriations — attempts to seriate the tops of the sticks with the bottom ignored, or the 
whole made into an arrow pattern similar to that shown in Fig. 1(c). Seriations using five or less 
elements were also classified as near-seriations; 

5. seriations - six or more sticks drawn in a seriated pattern, regardless of whether the 
drawing retained the left-right orientation of the original stimulus or was a mirror image of it. 
Two judges independently classified the drawings. There was a 94.5 per cent agreement between
their classifications, the remaining differences being resolved by a third judge. 

The results of the copying task by the 29 children seeing the seriated stimulus are shown in 
Table 1 in terms of their seriation level eight months later. It can be seen that the later seriation 
level of the child is related to the quality of his direct copy of a seriated pattern. Thus, 66.7 per
cent of seriators had copied the seriation pattern accurately. But 50.0 per cent of the
intermediates copied the same pattern as a near-seriation, with only 33.3 per cent being able to
make a correct copy of the stimulus. Non-seriators were the only group who drew primitive
patterns as their copy (27.3 per cent). It should be noted that in this simple, direct copying task,
some 36.4 per cent of the non-seriators were correctly able to copy the seriated pattern, which is
the largest percentage of those children. But this percentage is much below those found in the
two more advanced groups. Using Kendall's Tau (Kendall, 1955), the trends in Table 1 were
found to be significant (z = 1.96, P < 0.03, one-tailed).
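
The trend statistic reported with Table 1 can be recomputed from the cell counts. The sketch below is an added illustration, not the author's procedure: it assumes the ordered contingency-table form of Kendall's S (concordant minus discordant pairs) and, as a further assumption, the tau-c normalisation; the paper cites only Kendall (1955). The function names are hypothetical, and the z value is not attempted.

```python
from itertools import product

# Counts from Table 1: rows = seriation level (I, II, III),
# columns = copy type (random, primitive, near-seriation, seriation).
table = [
    [1, 3, 3, 4],
    [1, 0, 3, 2],
    [0, 0, 4, 8],
]


def kendall_s(t):
    """Concordant minus discordant pairs for an ordered r x c table."""
    rows, cols = len(t), len(t[0])
    concordant = discordant = 0
    for i, j in product(range(rows), range(cols)):
        # Compare each cell with every cell in a later (higher) row.
        for k, l in product(range(i + 1, rows), range(cols)):
            if l > j:
                concordant += t[i][j] * t[k][l]
            elif l < j:
                discordant += t[i][j] * t[k][l]
    return concordant - discordant


def tau_c(t):
    """Tau-c: S scaled by n^2 (m - 1) / (2m), m = min(rows, cols)."""
    n = sum(map(sum, t))
    m = min(len(t), len(t[0]))
    return 2 * m * kendall_s(t) / (n * n * (m - 1))


print(kendall_s(table), round(tau_c(table), 3))
```

Under these assumptions the counts give S = 88 and a tau of 0.314, which agree with the values printed beneath Table 1.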

Children viewing the random stimulus were able to copy that pattern accurately, which would 

Table 2. Recall drawings of stimulus at one week and eight months: Number and percentage of
children of three seriation groups drawing various patterns as their memory

                        No memory
                        or other                                            Near-
                        design      Disordered  Random      Primitive   seriation   Seriation
                        No.      %  No.      %  No.      %  No.      %  No.      %  No.      %
Children shown seriated stimulus
 At 1 week:
  I (n = 8)               0      -    0      -    3   37.5    0      -    3   37.5    2   25.0
  II (n = 5)              0      -    0      -    0      -    0      -    2   40.0    3   60.0
  III (n = 9)             0      -    0      -    0      -    0      -    2   22.2    7   77.8
  Totals (n = 22)         0      -    0      -    3   13.6    0      -    7   31.8   12   54.5
 At 8 months:
  I (n = 8)               1   12.5    0      -    1   12.5    3   37.5    2   25.0    1   12.5
  II (n = 5)              0      -    0      -    0      -    0      -    4   80.0    1   20.0
  III (n = 9)             0      -    0      -    0      -    0      -    1   11.1    8   88.9
  Totals (n = 22)         1    4.5    0      -    1    4.5    3   13.6    7   31.8   10   45.5
Children shown random stimulus
 At 1 week:
  I (n = 10)              0      -    0      -    7   70.0    0      -    3   30.0    0      -
  II (n = 6)              0      -    0      -    4   66.7    1   16.7    0      -    1   16.7
  III (n = 4)             0      -    0      -    3   75.0    0      -    0      -    1   25.0
  Totals (n = 20)         0      -    0      -   14   70.0    1    5.0    3   15.0    2   10.0
 At 8 months:
  I (n = 10)              1   10.0    2   20.0    4   40.0    1   10.0    2   20.0    0      -
  II (n = 6)              2   33.3    0      -    1   16.7    2   33.3    0      -    1   16.7
  III (n = 4)             0      -    0      -    4  100.0    0      -    0      -    0      -
  Totals (n = 20)         3   15.0    2   10.0    9   45.0    3   15.0    2   10.0    1    5.0
Children given disordered sticks*
  At 1 week (n = 7)       2   28.6    2   28.6    0      -    1   14.3    0      -    2   28.6
  At 8 months (n = 7)     2   28.6    1   14.3    0      -    3   42.9    0      -    1   14.3

Note: I = Non-seriators, II = Intermediates, III = Seriators.
* Due to small numbers, groups I, II and III are combined.


seem to be evidence of their possessing the purely mechanical aspects of drawing ability. Even
here, however, two of 14 non-seriators produced primitive or near-seriation patterns as their 
direct copy and one of the 12 intermediate children produced a near-seriation pattern. 


Matching 


There were only 11 children of all seriation levels in the matching group who were shown the 
seriated stimulus. Of these, ten were correctly able to match the seriated stimulus, although it is 
interesting to note that two of these pointed to the mirror-reversed seriation as identical, a point
which will be noted in the discussion of the composition of the memory image itself. The only
child who did not match with a seriated pattern choice was a non-seriator who pointed to a
near-seriation (arrow) design, Fig. 1(c).

Of the nine children who were shown the random stimulus, three pointed to a primitive 
ordering as their direct match. Thus, some children imposed an ordering even while viewing the 
stimulus material. 

Table 3. Number and percentage of seriating and non-seriating children making a seriated
drawing having originally viewed the seriated stimulus

                  I Non-seriators (n = 8)    III Seriators (n = 9)
                  No.       %                No.       %
Copy                4    50.0                  6    66.7
One week            2    25.0                  7    77.8
Eight months        1    12.5                  8    88.9








Recall (recall and matching groups) 


The data for the recall drawings for all groups both at one week and at eight months are
presented in Table 2. In this table, recall and matching groups have been combined as it was 
observed that the results for the two groups were similar, i.e. there appeared to be no effect on 
children in the matching groups of having seen various displays on their later memory drawings. 

The results for the children who originally had been shown the seriated stimulus are found in 
the upper part of the table. It can be seen that intermediates and seriators drew mainly correct 
seriations from memory after one week, while most non-seriators drew random or near-seriation 
patterns. After an eight-month interval, however, there was some small improvement in the 
seriator group in that now eight rather than seven children drew seriations. The drawings of the 
other two groups deteriorated, and intermediates now overwhelmingly drew near-seriation 
patterns, while non-seriators mainly recalled primitive or near-seriation patterns. It should 
be remarked that three out of ten children who drew seriations after the eight-month interval, 
produced a mirror reversal of what they had originally been shown. These are counted as correct 
seriations, but the significance of these mirror reversals will be mentioned in the discussion. 

Changes in drawing the seriation stimulus over time are more clearly seen by comparing 
children in the two extreme groups, non-seriators and seriators. Table 3 shows the 
developmental course of the drawings of children in these groups who had viewed the seriated 
stimulus. It should be noted that as these are drawings by the same children over time, the 
numbers at the time of copying are less than shown in Table 1 which included also the copies 
made by children who were later in the recognition group and who thus did not make later 
drawings. Children who are at the operational level of being able to seriate at the end of the 
experiment showed improvement in their ‘memory’, with more children drawing a seriation 
pattern after an eight-month interval than were able to copy it correctly while viewing it! 
Although the numbers are small partly due to a ceiling effect, any improvement goes against the 
more usual expectation of the decay of memory traces. In direct contrast, the non-seriators grew
consistently worse. Half of them had the ability to copy the seriation while viewing it, and this
did not differ from the 66.7 per cent of seriators who were able to draw seriations as their copy.
By one week, however, the non-seriators had deteriorated and the seriators had improved so
that the differences between them were significant at the 0.05 level. And at eight months, only
one non-seriator ‘remembered’ the seriation, while of the seriators, only one child did not
correctly ‘remember’ it, and this difference was significant at the 0.003 level.
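
The text does not name the test behind these significance levels. As a hedged illustration only, the sketch below applies a one-tailed Fisher exact test (an assumption, not confirmed by the source) to the Table 3 counts of non-seriators and seriators who did or did not draw a seriation. The function name `fisher_one_tailed` is hypothetical.

```python
from math import comb


def fisher_one_tailed(a, b, c, d):
    """Probability of a top-left count <= a in a 2x2 table with fixed margins.

    Table layout:   drew seriation   did not
    non-seriators         a             b
    seriators             c             d
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    lowest = max(0, row1 + col1 - n)  # smallest feasible top-left count
    favourable = sum(
        comb(col1, x) * comb(n - col1, row1 - x) for x in range(lowest, a + 1)
    )
    return favourable / comb(n, row1)


# Counts from Table 3 (non-seriators n = 8, seriators n = 9).
p_copy = fisher_one_tailed(4, 4, 6, 3)
p_one_week = fisher_one_tailed(2, 6, 7, 2)
p_eight_months = fisher_one_tailed(1, 7, 8, 1)

print(round(p_copy, 3), round(p_one_week, 3), round(p_eight_months, 3))
```

Under that assumption the eight-month comparison gives a probability of about 0.003 and the one-week comparison falls just under 0.05, while the copy comparison is far from significance, in line with the levels reported above.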

Returning to Table 2, in the centre portion the results for all children seeing the random 
stimulus are presented. It can be seen that most children at all three seriation levels recalled a 
random pattern at one week. But even here one notes that six out of a total of 20 children (30.0
per cent) imposed some order, either primitive groupings, near-seriations, or even seriations on 

Table 4. Recognition of stimulus at one week and eight months: Number and percentage of
children pointing to various patterns as their memory

                                      Pattern chosen as recognition
                  Random (e)      Primitive (a) or (f)  Near-seriation (c)  Seriation (b) or (d)
                  No.      %      No.      %            No.      %          No.      %
Children shown seriated
pattern (b) (n = 7)
  At 1 week         0      -        1   14.3              1   14.3            5   71.4
  At 8 months       1   14.3        1   14.3              1   14.3            4   57.1
Children shown random
pattern (e) (n = 11)
  At 1 week         7   63.6        2   18.2              0      -            2   18.2
  At 8 months       5   45.5        2   18.2              1    9.1            3   27.3

Note: Letters refer to displays as illustrated in Fig. 1.


their memory. After eight months, all seriators recalled the random pattern, but six of the 16
other children (37.5 per cent) imposed some kind of ordering in their memory.

Similar results were obtained from the group of children who merely saw the disordered sticks 
(bottom of Table 2). At one week, 42.9 per cent imposed an ordering on their recall, and at eight
months, 57.2 per cent did so.


Recognition 


The results from children in the recognition groups are shown in Table 4. The numbers are too 
small to be meaningfully divided by operational level. Overall, there was some deterioration of 
memory. Of those who had seen the seriated stimulus, 71.4 per cent correctly recognized it after
one week, and 57.1 per cent recognized it after eight months. What is interesting, however, is
that of those who had actually seen a random stimulus, 36.4 per cent nevertheless ‘recognized’,
at one week, some type of ordered stimulus as the one they had seen. And after eight months,
54.6 per cent pointed to an ordered stimulus as their ‘recognition’. Indeed, 27.3 per cent actually
‘recognized’ the seriated pattern as the one they had seen eight months earlier, although in fact 
they had been shown a randomly arranged pattern. 
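
The percentages quoted for the random-stimulus recognition group follow from the Table 4 counts. The sketch below is an added illustration with hypothetical names; it simply recomputes, for the 11 children shown the random pattern, the share who pointed to any ordered pattern and the share who pointed to a seriation.

```python
# Table 4 counts for children shown the random pattern (e), n = 11.
random_group = {
    "1 week": {"random": 7, "primitive": 2, "near-seriation": 0, "seriation": 2},
    "8 months": {"random": 5, "primitive": 2, "near-seriation": 1, "seriation": 3},
}


def percent(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


def percent_ordered(choices):
    """Share of children choosing any pattern other than the random one."""
    n = sum(choices.values())
    return percent(n - choices["random"], n)


ordered_1wk = percent_ordered(random_group["1 week"])
ordered_8mo = percent_ordered(random_group["8 months"])
seriation_8mo = percent(random_group["8 months"]["seriation"], 11)

print(ordered_1wk, ordered_8mo, seriation_8mo)
```

Computed from the counts, the eight-month figure is 6 of 11 children, or 54.5 per cent; the 54.6 given in the text matches the sum of the rounded column percentages (18.2 + 9.1 + 27.3).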

Another possible way to study a supposed memory image would be to see how closely this 
image corresponds to the original stimulus. In this experiment, one of the displays on the 
recognition board was a mirror-image reversal of the original seriated stimulus. Thus, pointing to 
either (b) or (d) was credited as pointing to a seriated display. But it is interesting to note in the 
group who had actually seen the seriated display, that two of the five who ‘recognized’ it after 
one week, in fact ‘remembered’ the mirror-image of the one they had been shown. At eight 
months, one of the four pointing to a seriation, was ‘remembering’ a mirror-image of the 
original. 


Discussion 


The results of this experiment provide support for the findings by Piaget & Inhelder (1973) that 
children’s memory drawings of a seriated display improve over time as their cognitive abilities 
develop. This support at first glance appears limited since only a few children improved their 
memory drawings over time. But these children were seriators. Others, who did not improve, 
and in fact did worse after eight months, were not merely ‘forgetting’; rather, they drew as their 
memory more primitive orderings which corresponded to their non-seriation level. This 
divergence is most clearly shown in Table 3 where it can be seen that seriators made more 
accurate seriation drawings after eight months than when they had attempted directly to copy 
the stimulus, whereas non-seriators deteriorated considerably with only one child being able to 
produce a seriation drawing after eight months. These results thus give an affirmative answer to 
the first of the specific questions posed in the introduction, and in addition show a deterioration 
in the drawings by some children, the non-seriators, as compared to their direct copy. 

The results of this experiment also support the findings of Altemeyer et al. (1969) that the 
children viewing a randomly ordered set of sticks will be likely to remember them as more 
ordered after the passage of time. Finkel & Crowley (1973) had also found this to be the case 
with children presented with what here are labelled primitive patterns as the stimulus. In the 
present experiment it was found that children shown only the random presentation or merely a 
set of disordered sticks, also moved to so-called more ‘advanced’ drawings of these sticks in that 
they imposed an ordering, often of a seriation or near-seriation type, over time. The same results 
were obtained, and to an even more striking degree, in the recognition group viewing a random 
stimulus (see Table 4). The second and third questions posed in the introduction are thus also 
answered in the affirmative. These subnormal children imposed seriated orderings on the random 
stimulus, and this occurred even when there had been no previous pressure to seriate the 
material, as in a prior test of seriation ability. 

That the changes in recall drawings are not due to advances in drawing ability is revealed by 
the copying task. Half of the non-seriators in Table 3 were able to make a seriation drawing at 
the time that they viewed the stimulus, and this did not differ significantly from the 66.7 per cent 
of the seriators who were able to do so. But one group's drawings improved over time, while the 
other group's drawings steadily declined. Other experimenters (Dahlem, 1969; Finkel & Crowley, 
1973) have found similar improvements and deteriorations. As Finkel & Crowley gave no 
seriation test, they were unable to link their results to the child's operative level. Dahlem 
reported that it was those children who made imperfect reproductions who were seen to 
deteriorate. The results in the present experiment suggest that the answer to the fourth question 
asked in the introduction is that changes over time are not merely due to improved drawing ability 
with age. Some children's drawings improve — but only those who are or become seriators. A 
few non-seriators had the ability to draw a seriation when directly viewing the stimulus, but their 
drawings deteriorated with time. This interpretation is further supported by the recognition 
results. 

The copying task also provided other interesting results. Although some children, as just 
noted, were able to make drawings in advance of their seriation ability level when directly 
viewing the stimulus, many children made drawings which corresponded to their seriation level 
even when they attempted to make direct copies. Thus, 27.3 per cent of non-seriators drew 
primitive patterns as their copy, a majority of intermediates drew near-seriation patterns, while a 
majority of seriators drew seriations. It would appear that the operational level of the child even 
determines his encoding of the stimulus at the time he makes a direct copy, as suggested in the 
fifth question listed in the introduction. 

Moreover, it appears that there is no storage of a direct memory image. Though this is most 
clearly seen in the results of children shown a random stimulus or a set of disordered sticks who 
later imposed orderings in their memory drawings or recognition choices, it is also apparent in 
children shown the seriated stimulus. Even children who were seriators and who recalled or 
recognized the seriation pattern as the one they had been shown, often drew or pointed to a 
mirror-reversed pattern of the actual original stimulus. This could not occur if the child were 
merely reading off some kind of ‘photographic’ memory image. What is stored is still a mystery, 
but what one can conclude is that it is probably incorrect to view memory in terms of stored 
images which change over time, or even of stored images which are decoded differently at 
different levels of cognitive development. It might be that children are merely changing to a good 
Gestalt over time. Such a non-developmental notion is made unlikely, however, by the observed 
deterioration of drawings by children who were non-seriators as opposed to the improved 
drawings by children who were or became seriators. Such results imply a definite link between 
seriation ability and ‘memory’ drawings. If improved drawings over time were due merely to the 
activation of tendencies toward a good Gestalt as suggested in the sixth question in the 
introduction, there would be no reason for this observed difference between the groups. 

Piaget’s position that what is retained is actually some sort of symbolic representation which 
undergoes change over time seems to be a reasonable hypothesis, but his relative lack of 
emphasis on encoding processes has led others to interpret his position in an oversimplified 
manner as being purely a memory phenomenon, with most attention paid to the memory image 
itself. In this experiment, however, the results of the direct copying and matching groups 
showed that the very act of perception itself is an active one, and directly tied in this case to the 
kinds of cognitive functioning that Piaget has hypothesized. In essence, this is very close to the 
‘construction’ hypothesis proposed by Neisser (1967), where he states: ‘There are no stored 
copies of finished mental events ... but only traces of earlier constructive activity ... The traces 
are not simply 'revived' or 'activated' in recall; instead, the stored fragments are used as 
information to support a new construction’ (pp. 285-286). 

Finally, as has been already noted, the present experiment has shown that mildly retarded 
educationally subnormal children performed at the same levels of seriation ability and at similar 
mental ages, as children of normal intelligence. This is not surprising. Hermelin & O'Connor 
(1961) found, for example, that even severely subnormal children with a mean IQ of about 35, 
performed like MA-matched controls of normal intelligence on matching, recognition, copying, 
and reproduction tasks. In addition, this experiment has also shown that mildly retarded children 
behave on the basic ‘memory’ tasks for seriated displays in the same way as normal children in 
other experiments. There is therefore no reason to believe that any of the new results brought out 
by the particular experimental design would differ in normal children. 


Memory performance after arousal from different sleep stages 


M. J. Stones 


Learning material was presented to independent groups of subjects either after arousal from non-Rapid Eye 
Movement (non-REM) sleep, after arousal from REM sleep, or under conditions of no prior sleep. Measures 
of immediate and subsequent free recall were taken. 

Memory performance was found to be impaired where learning took place after non-REM arousal. This 
was manifest in the number of categories recalled, over both immediate and subsequent recall, and in the 
number of items recalled per category over subsequent recall. It was suggested that the memory performance 
decrement after non-REM arousal may be understood in terms of a retrieval deficit as well as a coding 
deficit. It is possible that the former is consequent upon a lower general level of arousal, whereas the latter 
is specific to memory. 





In an early study, Worchel & Marks (1951) found trials to criterion to be higher when list 
presentation was preceded by a period of sleep, rather than wakefulness, although there was no 
significant effect on trials to relearn the following morning. Since that time, there has been little 
further attempt to investigate directly the proactive effects of sleep on levels of learning and 
subsequent retention. However, such an attempt was made in a study by Stones (1973). 

In that experiment, a repeated measures design was employed with two conditions. In one 
condition, list presentation was preceded by arousal from a period of non-Rapid Eye Movement 
(non-REM) sleep, whereas in the other condition, no sleep preceded list presentation. In the 
prior sleep condition, subsequent free recall, but not immediate free recall, was found to be 
impaired relative to the no sleep condition. In addition, overt measures of rehearsal and recoding 
were taken at the time of list presentation and both showed impairment under the prior sleep 
condition. It was concluded that a coding deficiency at least contributed to the memory 
impairment under the prior sleep condition. 

The broad intention of the present study was to attempt to extend further knowledge of the 
effects of prior sleep on memory performance. First, it was questioned whether any memory 
deficit would be associated with all sleep stages equally, or whether the various stages would 
exert proactive effects of differential severity. In this connection, there are reasons for predicting 
differential effects. Differential effects have been reported with respect to the cancellation task 
(Fort, 1972), oculomotor control (Berger & Walker, 1972), the rod and frame test (De Koninck, 
Koulack & Oczkowski, 1973) and the spiral after effect (Lavie, 1974). Scott & Snyder (1968) 
employed several performance tasks and obtained differential effects during the first half of the 
night’s sleep. On the negative side, Tebbs (1972) failed to find differential effects, with respect to 
visualization performance. In those studies where differential effects were obtained, performance 
was generally found to be higher after arousal from stage REM, than from non-REM sleep. 

Second, an attempt was made to assess whether any memory deficit after sleep arousal would 
be characterized by a failure to recall entire ‘chunks’, or higher order units, by a failure to recall 
higher order unit members, or both, i.e. whether fewer chunks would be recalled, fewer items 
per chunk, or both. One suggestion, based on the work of Tulving and his colleagues, is that 
retrieval efficiency becomes manifest in the number of chunks recalled whereas coding efficiency 
becomes manifest in the number of items recalled per chunk (Craik & Masani, 1969). This 
suggestion will be discussed more fully at a later point. 

The present experiment involved an independent groups design, one group being aroused from 
non-REM sleep prior to list presentation, a second group being aroused from stage REM, whilst 
a third group experienced no prior sleep. The list consisted of conceptually categorizable 
material arranged in blocked manner, i.e. items from the same category were adjacently ordered. 
All subjects attempted both immediate free recall and subsequent free recall. 


Method 
Materials 


The learning task consisted of a list of 15 words presented orally by means of a tape-recorder. The words 
were obtained from the list of Black & Ausherman (1955) and were of fairly low frequency of occurrence in 
spoken English. They were divisible into five easily identifiable conceptual categories, each category 
containing three words. The list was read out twice in the form of word triplets (each triplet comprising a 
specific conceptual category) with the same triplet and within-triplet order on each occasion. The words in 
each triplet were read out over a 2 sec period, with a 5 sec pause between triplets. 


Subjects 


Twenty-four paid, male student volunteers came individually or in pairs to the sleep laboratory. They were 
evenly and randomly divided into three groups: (i) no prior sleep (NPS); (ii) arousal after termination of the 
first REM period of the night (RA); (iii) arousal after 30 min non-REM sleep at a level of stage 2 or deeper 
(NRA). In general, on those occasions where the subjects came to the sleep laboratory in pairs, one was 
assigned to the NPS group and the other to the NRA group. 


Procedure 


Subjects in the NRA group were presented with the word list 1 min after arousal from a period of 30 min 
non-REM sleep at a level of stage 2 or deeper. Subjects in the RA group were presented with the word list 
1 min after arousal at the termination of their first REM period of the night. The REM period was considered 
to have terminated if a gross body movement occurred, followed by onset of sleep stage 2. The subjects 
were aroused approximately 10-20 sec after onset of sleep stage 2. Subjects in the NPS group experienced 
no sleep prior to list presentation. In the NRA and RA groups, sleep stage scoring was carried out according 
to the method of Rechtschaffen & Kales (1968) and sleep was monitored on an Elma Schonander EEG, 
housed in a room adjacent to the bedroom. 

An attempt was made to control for circadian variation by roughly equating the times of list presentation 
over the three groups. To this end: (a) subjects in the NRA group retired to bed at approximately 11.15 p.m. 
This was, on average, 30 min later than for subjects in the RA group (since the first REM period of the night 
commonly occurs after one hour of non-REM sleep); (b) if subjects came to the sleep laboratory in pairs, 
both were presented with the word list at the same time; (c) subjects in the NPS group who came 
individually to the sleep laboratory were presented with the word list at approximately the same time as 
subjects in the other groups. 

After list presentation, immediate, written free recall was attempted, with a 75 sec time limit imposed. 
Subjects then worked at Raven’s Standard Progressive Matrices (Raven, 1960) for 20 min before attempting 
subsequent written free recall, with a 120 sec time limit imposed. The time limits had been shown to be 
adequate in a pilot study. 


Results 


As expected, the RA and NRA groups differed in terms of sleep-related criteria. The RA group 
averaged 80 min non-REM sleep and 7 min REM sleep, whereas the NRA group averaged 30 min 
non-REM sleep. For the RA group, approximately half of the non-REM sleep could be classified 
as ‘slow wave’, i.e. stages 3 plus 4, whilst the corresponding proportion was approximately 
two-thirds for the NRA group. 

With respect to the memory performance data, a measure of category clustering (Roenker, 
Thompson & Brown, 1971) was taken in order to assess whether subjects were coding in 
accordance with the conceptual organization of the list, or on an idiosyncratic basis. Since seven 
subjects from each group obtained maximal scores, it may be assumed that the conceptual 
categories provided the basis for coding. 
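
As an illustration of how a clustering score of this kind behaves, the following is a minimal sketch of the adjusted ratio of clustering (ARC) as it is commonly stated for Roenker, Thompson & Brown (1971): ARC = (R - E(R)) / (maxR - E(R)), where R is the number of adjacent same-category pairs in the recall order. That this is exactly the computation used in the present study is an assumption; the function name and the example category labels are hypothetical.

```python
from collections import Counter


def arc_score(recall_categories):
    """Adjusted ratio of clustering for one recall protocol.

    recall_categories: category labels of the recalled items, in output order.
    Returns None when the score is undefined (e.g. only one category recalled).
    """
    n_total = len(recall_categories)
    if n_total == 0:
        return None
    counts = Counter(recall_categories)
    k = len(counts)
    # R: observed repetitions, i.e. adjacent recalled items from the same category.
    observed = sum(a == b for a, b in zip(recall_categories, recall_categories[1:]))
    # E(R): repetitions expected by chance, given how many items of each category were recalled.
    expected = sum(n * n for n in counts.values()) / n_total - 1
    # maxR: repetitions obtained when recall is perfectly grouped by category.
    maximum = n_total - k
    if maximum == expected:
        return None
    return (observed - expected) / (maximum - expected)


# Hypothetical protocols: perfectly clustered versus strictly alternating recall.
clustered = arc_score(["a", "a", "a", "b", "b", "b"])
alternating = arc_score(["a", "b", "a", "b", "a", "b"])
```

A score of 1 corresponds to the maximal score mentioned above, where every recalled category is output as an unbroken run; chance-level ordering gives a value near 0.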

The memory performance data was next analysed with respect to: (1) the number of 
conceptual categories from which recall was obtained; and (2) the mean number of items recalled 
from these categories. The latter was obtained by dividing the number of items recalled by the 
number of categories recalled. The mean values over groups are presented in Tables 1 and 2. 
Since the assumptions required for parametric analysis could not be met on all occasions, it was 
decided to employ non-parametric analysis throughout. The Kruskal-Wallis one-way analysis of 
variance was utilized for three-group comparisons and the Mann-Whitney U test for two-group 
comparisons. Level of significance was P < 0.05 (two-tailed). 
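
The two dependent measures described above can be sketched as follows for a single subject. This is only an illustration under stated assumptions: the function name, the treatment of repeated words and intrusions (both ignored), and the example words are hypothetical and are not taken from the study.

```python
def recall_measures(recalled_words, category_of):
    """Return (categories recalled, mean items recalled per recalled category)."""
    # Assumption: repeats and intrusions (words not on the list) are not scored.
    correct = {w for w in recalled_words if w in category_of}
    categories = {category_of[w] for w in correct}
    if not categories:
        return 0, 0.0
    # Items per category = items recalled / categories recalled.
    return len(categories), len(correct) / len(categories)


# Hypothetical list fragment: placeholder words, not the study's items.
CATEGORY_OF = {
    "oak": "trees", "elm": "trees", "ash": "trees",
    "iron": "metals", "zinc": "metals", "tin": "metals",
}

n_categories, items_per_category = recall_measures(
    ["oak", "elm", "iron", "oak"], CATEGORY_OF
)
```

Because the second measure is conditional on a category having been recalled at all, a subject can lose whole categories without any change in items per category, which is what allows the two measures to dissociate.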

With respect to the number of categories recalled, Table 1 shows significant differences over 
both immediate and subsequent recall. Mann-Whitney U tests revealed the NRA group to obtain 
lower scores than the NPS and RA groups on both occasions (all four comparisons significant at 
P < 0.005), whereas the latter two groups differed on neither occasion. 


Table 1. Mean number of categories recalled 

                  Experimental groups
Time of recall    NRA     NPS     RA      Statistical comparison
Immediate         3.75    5       5       H = 11.76, d.f. = 2, P < 0.01
Subsequent        3.50    4.88    4.88    H = 13.50, d.f. = 2, P < 0.01


With respect to the number of items per category (Table 2), differences across groups were 
obtained over subsequent recall but not immediate recall. Individual comparisons of the 
subsequent recall scores showed the NRA group to score lower than either the NPS group 
(P < 0.01) or the RA group (P < 0.05), with the latter two again failing to reveal evidence of a 
significant difference. 


Table 2. Mean number of items recalled per category 

                  Experimental groups
Time of recall    NRA     NPS     RA      Statistical comparison
Immediate         2.63    2.82    2.67    H = 4.88, d.f. = 2, n.s.
Subsequent        2.16    2.69    2.59    H = 8.44, d.f. = 2, P < 0.02


Discussion 


It has been pointed out (Stones, 1973, 1974) that interpretation of the Worchel & Marks (1951) 
study may have been confounded by the effects of uncontrolled variation consequent upon a 
sleep-filled retention interval, i.e. time to sleep onset and differential retroactive effects of sleep 
stages. This difficulty was circumvented in both the Stones (1973) and present studies, where the 
retention interval was filled by wakefulness. Circadian effects potentially provide a further 
source of confounding variation, but attempts at control were relatively successful in all three 
studies. 

The present results further the findings of Worchel & Marks (1951) and Stones (1973) in that 
the memory performance decrement was not evenly distributed over all sleep stages, but was 
greater after arousal from non-REM sleep. In fact, performance after arousal from stage REM 
was so high as to be indistinguishable from that obtained under the no sleep condition, although 
a tendency towards ceiling effects may have obscured a slight but real difference. 

As mentioned in the introductory section, however, differences in level of performance, after 
REM and non-REM arousals, have been achieved with a variety of tasks. Such differences might 
be expected on electrophysiological grounds, since there is evidence for lower levels of tonic and 
phasic central nervous system (CNS) activity after arousal from non-REM sleep (Hubel, 1959; 
Evarts, 1960; Walsh & Cordeau, 1965; Broughton, 1968). It is therefore probable that differences 
in performance, between the RA and NRA groups, are not confined to memory but also reflect 
differences in general level of arousal. This line of reasoning can be further explored by a 
consideration of the two main measures of memory performance. 

It has been argued that the number of chunks recalled provides a measure of retrieval 
efficiency, whilst the number of items per chunk can be considered as an indicant of coding 
efficiency (Craik & Masani, 1969; see also Lawrence, 1967; Johnson, 1972, 1973). This is based 
on evidence that items may be available for retrieval, but inaccessible unless appropriate cues 
are present (Tulving & Pearlstone, 1966; Tulving & Madigan, 1970). In a complementary fashion, 
it has been found that the number of items recalled per chunk increases under conditions where 
more efficient coding might be expected (Tulving, 1966). That high category clustering scores 
were obtained in the present study, makes it clear that the two main measures of memory 
performance can be regarded as equivalent to those discussed by Craik & Masani (1969). 

The results obtained therefore suggest both lower retrieval efficiency and 
lower coding efficiency under non-REM arousal conditions (the former being evidenced by low 
category recall and the latter by low recall of category members). With regard to the former, 
Eysenck (1975) has suggested that the retrieval component of recall is affected by general level 
of arousal. Since low category recall was noted over immediate as well as subsequent recall, the 
possibility arises that the retrieval impairment was consequent upon a lower level of general 
arousal. 

However, it is difficult to argue that the lower recall of category members was much influenced 
by general arousal effects. The NRA group performed at a comparable level to the other groups 
during immediate recall, where differences in general arousal might be expected to be maximal. 
It is probable that this measure reflects a loss of coding efficiency, after non-REM arousal, that 
is specific to memory. This aspect of the results replicates those of Stones (1973), i.e. differences 
during subsequent recall only, where direct evidence for lower coding efficiency was obtained. 

In conclusion, the present findings suggest that REM and non-REM sleep exert differential 
proactive effects on memory performance. Theoretical understanding is further enhanced by the 
suggestion of a retrieval deficit as well as a coding deficit after non-REM arousal. It is possible 
that the former is consequent upon a lower level of general arousal, whilst the latter is specific to 
memory. 


Verification processes in recognition memory: The role of natural language 
mediators 


Philip H. Marshall and Randolph A. S. Smith 





The existence of verification processes in recognition memory was confirmed in the context of Adams’ 
(Adams & Bray, 1970) closed-loop theory. Following a session of learning 100 CVCs by either a mediation or 
rote strategy, subjects’ recognition was tested. Retention of items, learning strategy, and possible NLMs 
were assessed, as was the confidence associated with each decision. The expectation was that the data would 
reveal consistent internal relationships supporting the position that natural language mediation plays an 
important confirmatory role in recognition performance. The data are relevant to discussions of the 
relationship between recognition and recall, and the existence of retrieval processes in recognition memory. 





The prevailing sentiment concerning recognition and recall tasks has been that they reflect 
fundamentally different memory processes. In the clearest statement of the alleged distinction, 
Kintsch (1970) states that the basic difference between the two measures is that recall involves a 
search process whereas recognition does not. Many studies, in various contexts, have supported 
this argument (Estes & DaPolito, 1967; Kintsch, 1968; Postman, Stark & Fraser, 1968). 

More recently, however, several theorists have offered accounts of the recognition act which 
incorporate the operation of retrieval processes at certain levels (Mandler, 1972; Atkinson, 
Hermann & Westcourt, 1974; Atkinson & Juola, 1974). Mandler (1972) concedes that some items 
may be recognized solely on the basis of familiarity or occurrence information, but that other 
items lacking in sufficient familiarity or occurrence information are subjected to a retrieval 
check. In Mandler’s research with categorized lists (Mandler & Boeck, 1974), this check 
‘involves the identification of the appropriate category of the to-be-recognized item and a 
comparison between that item and...items from the stored category’ (p. 613). 

Most relevant to the present research is the closed-loop theory of paired-associate learning 
offered by Adams (Adams & Bray, 1970). Adams hypothesizes various levels in the recall 
retention process where comparison and verification checks are conducted. Adams’ closed-loop 
theory considers that, in the test phase where the stimulus term alone is presented, a potential 
response is generated by a stimulus and the appropriateness of that response is determined by a 
comparison with perceptual representations of responses laid down during acquisition. The 
perceptual representations of a response are essentially the memories of previous stimulation 
associated with that response. They may be from any sensory modality and may include 
proprioceptive feedback from the actual saying of the response (Adams, McIntyre & Thorsheim, 
1969). A similar comparison process is assumed for stimulus term recognition as well. Additional 
verification is provided for items that have been learned with the aid of a natural language 
mediator (NLM), in that the NLM, if available at the time of retention testing, provides an 
additional source for confirming the appropriateness of the response generated at the time of 
recall. An example of an NLM is the use of the word ‘chemistry’ to help learn the syllable 
KEM. 

While Adams' theory focused on paired-associate recall, certain features may be extended to 
single-item recognition, the focus of the present study. To illustrate, if an item is learned with 
the aid of a NLM, the occurrence of the NLM at the time of recognition testing may serve as a 
verifier of the subject's choice (OLD/NEW). Items learned without the aid of a NLM would not 
have this additional verification system and may be less well recognized. Moreover, the subject's 
confidence in his choice (OLD/NEW) should be highest when the NLM is available during 
recognition, and mediated items with NLMs available should have higher confidence ratings than 
mediated items with the NLM not available. The functioning of such a process is related to the 
role of context maintenance emphasized by several researchers (Light & Carter-Sobell, 1970; 
Thomson, 1972). 

Marshall, Chatfield & Janek (1975) have confirmed some role for mediational processes in 
recognition in that CVC pairs learned by mediation had a higher probability of being correctly 
recognized than pairs learned by rote. The simplicity of that study, however, did not allow the 
complexities of the verification process to be determined. 

The purpose of the present study was to provide an internal analysis of the many potential 
functions of NLMs within item recognition. Of particular concern will be the confirmation and 
localization of the verification process. Rather than list these functions here, each hypothesized 
relationship will be briefly described below, as is appropriate. 


Method 
Subjects 


The subjects were 40 undergraduate students enrolled in introductory psychology courses at Texas Tech 
University. Participation in the experiment was in partial fulfillment of course requirements. Subjects 
participated five at a time in two one-hour sessions scheduled three days apart. 


Materials 


The stimulus material consisted of 200 CVCs obtained from Noble's (1961) norms and ranged from 2.23 to 
2.51 on the associability scale, with a mean value of 2.37. 


Acquisition 

The subjects first heard instructions describing rote and mediational (NLM) learning strategies. They were 
instructed to learn a list of 100 CVCs, using the strategy that they preferred for each CVC. If a particular 
CVC was learned by mediation, the subject was asked to write the NLM on a sheet of paper. The CVCs 
were shown sequentially by a Carousel 850-H projector for a duration of 15 seconds each. 

After the subjects had engaged in learning the CVCs, they were given an arithmetic test, supposedly to 
measure the effect of prior learning of verbal materials on the accuracy of addition. This 'test', and the 
omission of specific reference to the subsequent retention test, were designed to prevent the subjects from 
discerning the true nature of the experiment and from rehearsing the CVCs over the retention interval. 


Retention 

After a three-day retention interval, the subjects saw a sequence of 200 CVCs; the 100 CVCs that they had 
seen earlier (OLD) and 100 that they had not seen before (NEW). The CVCs serving as stimulus items were 
counterbalanced between groups. Half of the subjects saw one group of CVCs as learning material and the 
other CVCs as distractor items on the recognition test. Two different random orders of presentation were 
used for each group of stimulus items. Subjects viewed each slide for 15 seconds and were aware that half 
of the items were targets and half distractors. 

Subjects were instructed to indicate, for each CVC, whether it was OLD or NEW, and then to indicate 
their confidence in that decision from 1 (low) to 5 (high). If the CVC was called OLD, subjects were asked 
to indicate whether it was learned by rote or by mediation and to indicate their confidence in that decision. 
Finally, if the CVC had been called OLD and judged as originally learned by mediation, subjects were asked 
to provide their NLM on the answer sheet and to give their confidence that the NLM written was actually 
the one originally used during acquisition of the CVC. Thus, a maximum of three responses, each with a 
confidence rating, could be required of a subject for each item on the retention test. 


Results and discussion 

It is inevitable that when 40 subjects are asked to respond at several levels to 200 items with a 
maximum time allotment of 15 seconds per item that some information will be lost due to subject 
omission. This accounts for the total OLD items (presented below) summing to 3997 rather than 
4000. For the remaining analyses, as well as for tabular entries, the magnitude of missing 
observations is small, and it is extremely doubtful that, had the omissions not occurred, the 
results would have been affected significantly. 

During the acquisition phase subjects learned 69.85 per cent of the CVCs (2792) with the aid of 
a NLM; 30.15 per cent (1205) were learned by rote. On the recognition test, 82.23 per cent of the 
mediated items were correctly recognized as OLD, whereas only 67.55 per cent of the rote items 
were correctly recognized. For a t-test comparison, the score for each subject was the 
proportion of mediated and rote items correctly recognized, and the test excluded subjects who 
had all mediated or all rote items. That comparison yielded a significant difference, t = 5.88, 
d.f. = 33, P < 0.001, and supported the major closed-loop notion that mediated items should yield 
superior retention, and replicated the findings of Marshall et al. (1975). 


Confidence in OLD/NEW (item recognition) decision 


Closed-loop theory makes predictions concerning subjects' confidence in their decision of 
OLD/NEW. Correctly recognized items (hits) should have higher confidence ratings than 
incorrectly recognized items (false alarms and misses) since, presumably, the correct items have 
been matched successfully with a representation laid down during acquisition. Correct rejections 
(CRs) should also have a relatively high OLD/NEW confidence because there should be no 
memorial representation present to raise doubt. However, the absence of a verification should 
not inspire as high a confidence rating as the presence of a verifying representation. Table 1 
gives supporting data for these assumptions. Moreover, a t-test comparison indicated that hits 
had a higher confidence than false alarms (FAs), t = 9.52, d.f. = 39, P < 0.001, and that CRs had 
higher confidence ratings than misses, t = 6.05, d.f. = 39, P < 0.001. These results are consistent 
with closed-loop referential notions. 


Table 1. Mean confidence rating (OLD/NEW) for items as a function of OLD/NEW decision 

                   Hits    Misses   FAs     CRs
Mean confidence    4.33    3.29     3.64    3.62
No. observations   3102    872      1283    2680

Natural language mediators play an additional verification role in closed-loop theory. An item 
which was originally learned by mediation, and which has a NLM present at retention testing, 
can be further verified, whereas an item which was originally learned by rote cannot. Table 2 
gives evidence supporting this contention. A t-test comparison indicated that mediated hits had a 
higher confidence than rote hits, t = 4.72, d.f. = 33, P < 0.001, which closed-loop theory would 
predict. The same mechanism is at work with FAs also. Although a FA item has never actually 
been seen or learned, it is thought to have been. Therefore, it could have been perceived as 
‘learned’ by either mediation or rote. A comparison between mediated and rote FAs’ 
OLD/NEW confidence also showed a significant difference in favour of mediation, t = 3.95, 
d.f. = 35, P < 0.001. The pseudo-NLM still served the purpose of raising confidence. There was 
no significant difference between misses which were originally learned by mediation and 
rote, perhaps because the mediated miss resulted from the absence of the NLM. In the main, 
these results are consistent with closed-loop notions concerning the verification role of NLMs. 


Table 2. Mean confidence rating (OLD/NEW) for items as a function of original or perceived 
(for FAs only) learning strategy 

                     Mediated   Rote    Mediated   Rote    ‘Mediated’   ‘Rote’ 
                     hit        hit     miss       miss    FA           FA 
Mean confidence      4.42       4.06    3.32       3.25    3.90         3.49 
No. observations     2289       813     484        388     510          756 
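
As an informal consistency check, the mediated and rote entries of Table 2 can be pooled, weighting each mean by its number of observations, to see whether they reproduce the overall hit and miss means of Table 1. The sketch below assumes only the published table values; the helper name `pooled_mean` is illustrative.

```python
def pooled_mean(means, counts):
    """Weighted mean of group means, weighting each by its observation count."""
    total = sum(counts)
    return sum(m * n for m, n in zip(means, counts)) / total


# Table 2 values: (mediated, rote) mean confidence and numbers of observations.
pooled_hits = pooled_mean([4.42, 4.06], [2289, 813])
pooled_misses = pooled_mean([3.32, 3.25], [484, 388])

# Table 1 reports 4.33 for hits (n = 3102) and 3.29 for misses (n = 872).
print(round(pooled_hits, 2), round(pooled_misses, 2))
```

The false-alarm columns are left out of this check because the Table 2 counts (510 + 756) do not sum to the 1283 FAs of Table 1, presumably owing to the missing observations mentioned earlier.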

A subject could have correctly recognized an originally mediated item but could have supplied 
either the correct NLM, an incorrect NLM, or no NLM (an omission) on the retention test. For 
items that were mediated in acquisition, correctly recognized on the test as OLD, and correctly 
classified as having been learned by mediation, an analysis was made of item recognition 
confidence (OLD/NEW) as a function of the mediator status. Of these mediated hits, 82 per cent 
had correct NLM recall with a mean OLD/NEW confidence rating of 4.60; 13 per cent had 
incorrect NLM recall with a mean OLD/NEW confidence rating of 4.47; and only 4 per cent of 
the items did not have any NLM recall (omitted NLM recall), with a mean OLD/NEW 
confidence rating of 3.61. Several t-test comparisons were made using the different combinations 
of OLD/NEW confidence ratings given above. No significant difference was obtained in 
OLD/NEW confidence between mediated hits given correct and incorrect NLM recall (P > 0.10). 
Significant differences were obtained for OLD/NEW confidence between mediated hits with 
correct NLM recall and omitted NLM recall, t = 2.55, d.f. = 14, P < 0.05, and marginal 
significance between incorrect NLM recall and omitted NLM recall, t = 2.01, d.f. = 11, P < 0.10. 
These analyses indicate that, if a NLM is available at the time of the test, the subject has an 
OLD/NEW confidence greater than if no NLM is available, regardless of the correctness of the 
NLM. Moreover, OLD/NEW confidence is increased by roughly the same amount even for an 
incorrect NLM. This fact is not necessarily at odds with closed-loop theory. For a mediated 
item, there exist two potential verification sources: the memorial representation of the item 
itself and the representation of the NLM. The presence of any NLM seems to increase 
confidence such that even an available incorrect NLM may verify the choice. The subject is 
looking for confirmation and may find it even in an incorrect NLM. A similar account has been 
offered to explain certain apparently anomalous findings in motor retention (Adams, Goetz & 
Marshall, 1972). At any rate, those NLMs that were supplied for mediated hits were 82 per cent 
correct. 

In terms of the closed-loop theory, a FA represents an item which, at the time of the 
recognition test, though being new, has made a minimally successful comparison with the 
representation of an old item and is thus judged as OLD on the recognition test. The OLD/NEW 
confidence of a FA item should be less than that of a hit, regardless of how the subject thought 
the item was learned since, at best, only a minimally successful comparison can be made. The 
mean OLD/NEW confidence of FAs thought learned by mediation (n = 510) was 3.90 and that of 
FAs thought learned by rote (n = 756) was 3.49. Both of these mean ratings are significantly 
different, respectively, from those of mediated hits judged learned by mediation (4.54, n = 1769, 
t = 6.85, d.f. = 38, P < 0.001) and rote hits judged learned by rote (4.02, n = 665, t = 3.84, 
d.f. = 30, P < 0.001). 


Confidence in recalled NLMs 


The discussion so far has centred on measures of CVC recognition. Data are available, however, 
to test closed-loop notions concerning the efficiency of NLM recall. For mediated hits with 
correct NLM recall, the mean confidence in the NLM given was 4.32; for incorrect NLM recall, 
the mean confidence in the NLM given was 3.71. Logically, this rating is not possible for 
omitted NLMs. A t-test comparison yielded a significant difference, t = 6.59, d.f. = 32, P < 0.001, 
between confidence ratings of NLM recall for correct and incorrect NLMs. Once again this 
fulfils a closed-loop prediction and points to the fact that, although an incorrect NLM may 
increase confidence in item recognition (as indicated above), subjects do have the ability to 
discriminate a correct from an incorrect NLM when given the opportunity. This result is in line 
with other results (Smith & Marshall, 1976) showing that subjects are 90 per cent accurate in 
distinguishing their NLMs from those produced by other subjects. 

Given that a subject classified an item as OLD and thought that it was originally learned by 
mediation, three possibilities exist for classifying its NLM correctness. The item may have been 
an actual OLD mediated item, an OLD rote item, or a NEW item (FA). The mean confidence in 
NLM correctness for actual OLD mediated items was 4.32 (from above). The mean confidence 
in NLMs given to FAs thought learned by mediation (n = 440) was only 3.43, and the NLM 
correctness confidence of actual OLD items that were correctly called OLD, but mistakenly 
judged as having been learned by mediation (n = 121), was only 3.47. Closed-loop theory predicts 
that these latter two confidence ratings should be low since there is no internal representation of 
the NLM to verify its correctness. 


Confidence in rote/mediation (learning strategy) decision 

The final set of comparisons gives evidence that subjects know the way that items were 
originally learned and that this information may be obtained through confidence ratings. What 
follows is the analysis of confidence in original learning strategy given correct item recognition 
only. Subjects would not have given such a rating for an item judged to be NEW. Table 3 
presents these data. It can be seen that 78 per cent of all mediated items were correctly 
designated as having been learned by mediation. A t-test comparison yielded a significant 
difference in confidence of original learning strategy for mediated items correctly judged as 
having been mediated, and those incorrectly judged as having been learned by rote, t = 6.50, 
d.f. = 35, P < 0.001. A similar comparison for original rote items was not significant, P > 0.10. 
Moreover, correct NLM recall for mediated items judged learned by mediation was 83 per cent. 
It appears, therefore, that the confidence in choice of learning strategy of a correctly identified 
OLD item is facilitated greatly by the availability of a correct NLM at the time of testing for 
mediated items. Rote items lack this source of verification of learning strategy and, as a result, 
have a lower confidence rating for original learning strategy. False alarms judged learned by 
either mediation or rote had the expected low confidence ratings. False alarms should also be 
more likely to have been judged learned by rote since, by their very nature, representations of 
original NLMs are not available. This is supported by the data, which show that more FAs were 
judged learned by rote than by mediation (756 vs. 510). 


Table 3. Mean confidence in learning strategy as a function of actual and perceived 
learning strategy 

                               Perceived status 
Actual status                Mediated      Rote 
Mediated 
  Mean confidence            4.37          3.73 
  No. observations           1769          495 
Rote 
  Mean confidence            3.69          3.85 
  No. observations           147           638 
False alarm 
  Mean confidence            3.74          3.53 
  No. observations           506           716 


Conclusion 


The intent of the present study was to explore the complexities of a retention process that, in 
the past, has been considered all too simple. The theoretical framework of the investigation 
centred on the role of verification and confirmation mechanisms as suggested by closed-loop 
theory. The data from the many analyses are pointedly internally consistent with the assumption 
of a verification process of the type discussed. 


Natural language mediators have been viewed as epiphenomenal (Underwood, 1972), that is, as not 
functionally related to retention performance. Having NLMs available at the time of retention 
testing, so the argument goes, does not necessarily mean that they have an important role in item 
verification. It might even be that NLMs are only reconstructed at the time of the retention test, 
given the presence of the stimulus item. The present study yielded data to contradict these 
arguments. 


Recognition may at times be viewed as a simple perceptual matching process, as in the case of 
items learned by rote. However, when verbal mediators are generated during acquisition, the 
complexity of the recognition act is greatly increased to the benefit of recognition efficiency. 


Response times to stimuli of increasing complexity as a function of ageing 


T. C. Jordan and P. M. A. Rabbitt 


Twelve elderly and 12 young subjects were tested on a series of experiments with increasing complexity of 
perceptual-response mapping. As task complexity increased the differential slowing in performance between 
young and old increased and an age × task complexity interaction was observed. However, with practice this 
phenomenon disappeared leaving an apparent age lag constant. This slowing was due to increased central 
processing time rather than peripheral factors. 

No major differences in strategies were observed between the groups, though the old subjects tended to be 
less able to extract critical (useful) features from the display. Stimulus repetitions of a new kind were found 
where all characteristics of the stimulus (relevant and irrelevant) were important. Repetitions of coding rules 
rather than of particular signals or responses also facilitated RT. It was also found that later in practice old 
subjects were making fewer errors than the young, reversing earlier observations. 


It is well established that human reaction time (RT) increases in old age. This may reflect 
changes in either central or peripheral processes. Early investigators tended to regard response 
mechanisms in the local musculature as being the primary cause for this slowing with increasing 
age. More recent accounts (Welford, 1958; Surwillo, 1968) stress the role of central processing 
functions such as the time taken to identify signals and to select appropriate responses to them. 

Smith, Chase & Smith (1973) postulate the existence of a series of four processing stages 
which allow distinctions to be made between stages that process information about the stimulus 
(the encoding and comparison stages) and stages that are primarily concerned with selecting and 
executing a response. Determination of the locus of the slowing of response speed with age 
requires a method for separating signal-processing time from time required to execute a 
response. 

One line of approach is provided by Birren & Botwinick (1955) and Botwinick, Brindley & 
Robbin (1958) who found that as discrimination difficulty was increased, RTs for old subjects 
increased more sharply than those of the young. Similarly, Kay (1955) and Simon (1967) found 
that when stimulus response allocation rules were made more difficult, older subjects showed 
relatively poorer performances. Other studies, however, as Tolin & Simon (1968) noted, have 
failed to support this age × task complexity interaction. They suggest that the presence or 
absence of such an interaction may depend upon the particular level of task complexity 
investigated. It would seem that further clarification would be valuable here, particularly if the 
locus of these effects could be isolated. Additionally, as Biederman (1972) has pointed out, the 
experimental study of choice reaction time and perceptual recognition has typically involved 
tasks where sources of information were either relevant or irrelevant. Outside the laboratory we 
are more commonly faced with situations where information which is relevant in one situation 
may not be in another (i.e. where information may be contingent upon the status of other 
sources of information). The functional significance of such contingencies would seem to 
lie in the economy they afford in responses to the increases in complexity at information 
transformations between input and output. Whether old people are differentially affected by 
such features would have both theoretical and practical significance. 

There are two further factors which complicate age comparisons of this type. Most comparisons 
between old and young people have been made when both groups are relatively unpractised 
at choice response tasks. They have also used relatively simple tasks, the ‘rules’ of which are 
quickly mastered, and consequently experimental designs have been correspondingly short (Tolin 
& Simon, 1968). It remains therefore an open question as to whether an age × task complexity 
interaction is still observed when subjects are highly practised, whether the effects of age 
disappear, or whether over all levels of task complexity old people are slower by an ‘age 
constant’. Such findings would have important practical implications too. If older people are to 
be used in industry and, in many cases, retrained then one needs to know the effects of added 
task complexity on performance, the expected adjustment periods required, together with error 
rate patterns (a factor very often neglected in much of the literature). 

A second question is raised by investigations of the ‘repetition effect’ in young subjects. It has 
long been known that in serial choice response tasks, if a particular signal and the response to it 
is repeated, subjects respond faster than if a new response has to be made (Bertelson, 1961). 
Indeed it appears that one can further classify this phenomenon into signal repetition effects, 
response repetition effects and rule mediated effects (where neither the stimulus nor the response 
is repeated) (Rabbitt, 1968; Rabbitt & Vyas, 1973). Since the balance of stimulus and response 
effects changes with practice, one cannot expect them to be independent, and if age decrements 
in RT were specifically related to differences in compatibility it may well be that age differences 
in response speed might be abolished for repeated responses but not for other responses. So far, 
results (Simon, 1968) have failed to provide a conclusive answer as to whether it takes longer to 
programme an incompatible response than it does to programme a compatible one. 

The following experiments therefore were undertaken to consider the effects of ageing on 
response times to stimuli of increasing complexity in serial choice RT tasks; to consider whether 
age differences were reduced or abolished on such tasks; and to examine repetition effects which 
involve, separately, repetitions of the whole or part of a complex signal, and of a particular 
motor response, and responses which involve repetition of a particular coding rule. An attempt 
was therefore made to discover if, and how, practice, compatibility and the repetition effect 
interact with age, and further to attempt to specify the locus of any such effects. 


Experiment I 
Subjects 
The subjects were elderly persons, 6 male and 6 female, with an average age of 69 years. All were healthy 
and maintaining a high level of community independence. 

Twelve young subjects, with a mean age of 20 years, were matched with the above by sex and by scores 
obtained on Raven's Standard Progressive Matrices and the Mill Hill Vocabulary Scale. Since Mill Hill 
scores change very little with age while Raven's scores do decline, matching on these criteria assures us that 
the older subjects when young were at least as intelligent as their cohorts. 


Apparatus 


A Stimulus Presentation and Reaction Time Apparatus (SPARTA) was programmed by punched tape to 
present sequences of 500 successive signals, one at a time, on a ‘Digitron’ display tube. Signals were either 
a cross or a bar (+ or −) on either a red, green or amber background. 

Subjects sat at a keyboard facing the display and responded to each stimulus presentation by pressing the 
required keys. The sequence was begun by pressing one of the response keys and then proceeded as quickly 
and accurately as possible through the run of 500 presentations. A response on any key caused the display to 
change to the next state in the sequence within 40 msec. 

SPARTA automatically recorded every signal and the resultant response on punched tape. Results were 
analysed into repetitions and alternations of sequences and errors made. 


Procedure 
This was a two-choice task and subjects were instructed to press one key whenever a cross appeared, and 
the other key whenever a bar appeared. They were specifically told that the coloured backgrounds to the 
shapes were completely irrelevant to the task and should be ignored. 

A random sequence of twelve presentations was given as a practice before recording began. Instructions 
then were to proceed as quickly and accurately as possible and if a mistake was made not to correct it, but 
to ignore it and continue with the next response. 
This being a relatively simple task it was felt that the 500 random, and equiprobable presentations of 
stimuli would provide sufficient opportunity to reach optimal performance without the interference of fatigue 
or the necessity of further trials. 


Results 


Errors were partialled out separately as were responses following an error. The latter were 
discarded as one could not be certain as to whether they were responses to the next stimulus or 
‘automatic’ error corrections. The remaining correct responses were analysed into a matrix of 36 
possible transitions between signal states. These transitions were then divided into four main 
classes. 

First a set of six identical transitions in which both the relevant signal (cross or bar) and the 
irrelevant background colour (red, green or amber) were repeated on successive occasions. The 
second class (of 12 possible cases) also consisted of repetitions of the relevant signal but with a 
change in the irrelevant background colour. Thirdly alternations were considered (six possible 
cases) where the relevant signal changed from a cross to a bar or vice versa, but where the 
background colour remained the same. Finally, the remaining 12 possibilities were alternations in 
both shape (relevant signal) and (irrelevant) colour background. Mean scores of individual 
subjects’ pooled data may be seen in Table 1. 
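
The four-way classification of the 36 transitions described above can be sketched by enumerating every ordered pair of the six signal states. This is only an illustration of the counting scheme; the labels and function names are assumptions made here and are not taken from the original analysis.

```python
from collections import Counter
from itertools import product

SHAPES = ("cross", "bar")              # relevant dimension
COLOURS = ("red", "green", "amber")    # irrelevant background colour
STATES = list(product(SHAPES, COLOURS))  # the six possible signal states


def classify(previous, current):
    """Assign a transition between two (shape, colour) states to one of four classes."""
    shape_same = previous[0] == current[0]
    colour_same = previous[1] == current[1]
    if shape_same and colour_same:
        return "repetition, colour same"
    if shape_same:
        return "repetition, colour alt."
    if colour_same:
        return "alternation, colour same"
    return "alternation, colour alt."


# Count how many of the 36 ordered transitions fall into each class.
class_counts = Counter(classify(p, c) for p, c in product(STATES, repeat=2))
for label, count in class_counts.items():
    print(f"{label}: {count}")
```

The counts come out as 6, 12, 6 and 12, matching the class sizes given in the text.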


Table 1. Mean scores on four classes of transition states in two-choice task 

                              Young                    Old 
                         Mean (msec)   S.D.      Mean (msec)   S.D. 
Repetition of shape 
  Colour same               459        140          532        149 
  Colour alt.               496        134          556        144 
Alternation of shape 
  Colour same               573        128          617        143 
  Colour alt.               581        152          616        161 








As might be anticipated, young subjects were significantly faster than their older counterparts 
in each of the response classes considered (P < 0.001). Equally unsurprising was the fact that for 
both young and old subjects repeated responses were significantly faster (P < 0.001) than 
alternated responses. A closer inspection of the rank order of the four transition classes does, 
however, reveal some interesting features. Within the two classes of repetitions, when both the 
relevant and irrelevant components were repeated, the responses were faster than when the 
irrelevant component was not repeated. This alternation of the irrelevant component significantly 
(P < 0.01) increased the processing time for the young subjects; the increase in response time for 
the old, however, was not significant. 

This pattern was less obvious when considering alternations. Although the young group as a 
whole were slightly faster when there was a common colour component, it seemed that for 
50 per cent of them at least a total alternation of signal appeared to facilitate a faster response. 
The difference though was not significant. This suggestion that when both components were 
alternated responses were faster was even more apparent with the older subjects, though 
again the differences between classes three and four were not significant. 

When errors were partialled out it was found that the old people made significantly (P < 0.01) 
fewer errors than the young. Total errors were, on the whole, few with an overall percentage of 
0.08 and 0.05 for the young and old subjects respectively. There was some slight indication of a 
speed-error trade-off in both groups for the alternation responses. Total number of errors and 
the corresponding mean errors per classification state are shown in Table 2. 


Table 2. Mean number of errors for repetitions and alternations in Expts I, II, and III 

                                    Young                Old 
                                 Mean errors          Mean errors 
Expt. I 
  Repetition of shape 
    Colour same                      7                    4 
    Colour alt.                      6.3                  4.5 
  Alternation of shape 
    Colour same                     13.1                  7.8 
    Colour alt.                     10.6                  7.6 
Expt. II 
  Repetitions                        3.6                 25.1 
  Alternations                      15.2                 37.3 
Expt. III 
                                 Block 1   Block 4    Block 1   Block 4 
  Repetitions                       7        5.8        23        2.5 
  Alternations with 
    common colour                  15        7.3        25.5      4.8 
  Alternations                     20       13.9        28.2      8.8 


Discussion 


It seems that both groups of subjects responded more slowly when, although the relevant signal 
and response to it were repeated, the irrelevant background colour was changed. Previous work 
(Rabbitt, 1965) has suggested that old subjects seem less able to ignore irrelevant information 
than young subjects. One might accordingly expect that a change in the irrelevant background 
colour would be more disturbing for the old than for the young subjects. The present results 
would indicate that this is not the case. When the colour changed and the shape remained the 
same, mean RT increased by 37 msec for the young group and by 24 msec for the older group. 
One explanation might be that the older persons, despite the instructions to go as fast as they 
could, were slightly more cautious in their approach to the task; that is, if they were working 
more within their capacity than at their limits (as regards optimum speed of performance 
compatible with high levels of accuracy) then a change in background information would perhaps 
be less likely to disrupt their performance. Such a suggestion would seem to be supported by the 
fact that they made almost half the number of errors made by the young who, it should also be 
noted, showed a more marked speed-error trade-off especially in the alternation cases. 

Despite the relative simplicity of the task and the resultant well-practised state of the subjects 
neither young nor old were able to ignore the irrelevant information. Apparently the precise 
physical identity of successive displays affects the magnitude of the ‘repetition effect’. A similar 
finding was reported by Biederman (1972) using young subjects. He found that repetitions 
lengthened when an irrelevant dimension changed value in the absence of changes in the values 
of any other dimension. It would appear that both groups were responding not primarily to the 
cross or bar, but rather to the display as a ‘gestalt’. The critical factor in obtaining a repetition 
effect appears to be the detection of change between one display and the next. If successive 
displays were identical, very rapid responses could be made. If there was any discrepancy, albeit 
in an irrelevant attribute of the display, some further analysis appears necessary before an 
appropriate response can be chosen. Such a discrimination between ‘change’ and ‘no change’ 
would also explain why a change in the irrelevant background colour apparently reduced RTs for 
some subjects for alternated responses. They were after all looking for a change, albeit one of 
shape, and the ‘totality’ of the change facilitated their response. 

This marked disjunction in the repetition effect for signals, especially in the case of the young 
subjects, is clearly an indication of a stage one effect (signal preprocessing and encoding) 
according to the Smith, Chase & Smith paradigm. Biederman has also postulated that such 
contingencies have their effect at some prior stage to response selection. 

A different way of regarding the results is to compare the overall magnitude of the repetition 
effect in old and young subjects. An initial hypothesis was that RT for repetitions (since these 
may be regarded as responses not slowed by difficulties of stimulus response mapping) would 
increase less with age than RTs for other responses. One sees from Table 1 that RTs for 
repetitions increased by an average of 66.5 msec between young and old groups, whereas RTs 
for alternated responses increased by 39.5 msec between the two groups. This interaction was 
significant (P < 0.05). This apparent discrepancy might be accounted for by the younger subjects’ 
tendency towards a greater speed-error trade-off in the alternation cases as noted earlier. 
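
The age differences quoted in this discussion can be recovered from the group means of Table 1 in Expt. I. The sketch below assumes that each figure is the unweighted average of the two relevant table rows, which reproduces the quoted values; the dictionary keys are illustrative names only.

```python
# Group mean RTs (msec) from Table 1 of Expt. I.
rt = {
    "young": {"rep_colour_same": 459, "rep_colour_alt": 496,
              "alt_colour_same": 573, "alt_colour_alt": 581},
    "old":   {"rep_colour_same": 532, "rep_colour_alt": 556,
              "alt_colour_same": 617, "alt_colour_alt": 616},
}


def class_mean(group, prefix):
    """Unweighted mean of the two table rows whose key starts with the prefix."""
    values = [v for k, v in rt[group].items() if k.startswith(prefix)]
    return sum(values) / len(values)


# Old-minus-young difference for repetitions and for alternations.
repetition_age_diff = class_mean("old", "rep") - class_mean("young", "rep")
alternation_age_diff = class_mean("old", "alt") - class_mean("young", "alt")

# Cost of a change in the irrelevant colour when the shape is repeated.
colour_cost_young = rt["young"]["rep_colour_alt"] - rt["young"]["rep_colour_same"]
colour_cost_old = rt["old"]["rep_colour_alt"] - rt["old"]["rep_colour_same"]

print(repetition_age_diff, alternation_age_diff, colour_cost_young, colour_cost_old)
```

These are differences between group means only; the significance tests reported in the text require the individual subjects' data, which are not given.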

Nevertheless, this was a simple task in which stimulus-response mapping rules were never 
complex. It seemed necessary to investigate a more complicated task in which S-R mapping 
rules were made more difficult while the set of signal states to be discriminated was kept the 
same. Expt. II was designed to do this. 


Experiment II 
Subjects 


The same matched subjects were recalled five weeks later for this experiment. 


Apparatus 
SPARTA was again used, and a new tape of 500 stimulus signals produced. 


Procedure 


Subjects were instructed that this was a similar experiment in that they were to respond as quickly as 
possible while keeping errors to a minimum. Similarly any errors made were to be ignored and not corrected. 
This time, though, instead of simply regarding the shape (+ or —) they were to consider the coloured 
background too. If a red display (+ or —) occurred they were to press the right key. If a green display (+ or 
—) then they should press the left key. If an amber display occurred then they were to follow the original 
shape rule of (amber) + to the right and (amber) − to the left, thus making the task a four-choice one (see 
Fig. 1 for a schema of stimulus-response mapping). Allocation of responses (fingers) to displays was 
balanced across subjects in the event that Red-Right might be considered an easier location than 
Green-Left. 

Retaining the use of only two keys had the advantage of later being able to consider cases where not only 
did the signal change requiring a motor alternation (e.g. from Green + or — to Red + or —, or from Amber 
— to Red + or —, etc.) but also where the signal changed but the response key did not. That is, one had a 
motor repetition required for a response where a signal alternation had occurred (e.g. Amber + to Red —, 
etc.). 

A practice period to allow familiarity with the rules was given before recording began. 
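
The mapping rules of Expt. II, and the distinction between motor repetitions and motor alternations that they make possible, can be sketched as follows. The sketch assumes one particular key allocation (red to the right key, green to the left), whereas in the experiment the allocation was balanced across subjects; the function names and the ASCII symbols "+" and "-" are illustrative choices.

```python
def required_key(colour, shape):
    """Return the response key for a display under one assumed key allocation."""
    if colour == "red":
        return "right"            # colour rule: shape is ignored
    if colour == "green":
        return "left"             # colour rule: shape is ignored
    if colour == "amber":
        return "right" if shape == "+" else "left"   # shape rule
    raise ValueError(f"unknown colour: {colour}")


def motor_transition(previous, current):
    """Label a transition 'MR' (same key as before) or 'MA' (other key)."""
    same_key = required_key(*previous) == required_key(*current)
    return "MR" if same_key else "MA"


# Examples taken from the procedure text.
print(motor_transition(("amber", "+"), ("red", "-")))   # signal changes, key does not
print(motor_transition(("green", "+"), ("red", "+")))   # signal and key both change
print(motor_transition(("amber", "-"), ("red", "+")))   # signal and key both change
```

Keeping only two keys is what allows a signal alternation to be paired with either a motor repetition or a motor alternation, as the procedure section notes.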
[Figure 1. Schematic diagram of stimulus-response mapping in Expts I, II and III. Response keys were balanced across subjects.] 


Results 


RTs were obtained for each of the 36 possible transition states as in Expt. I. The classes of 
interest this time were of course more numerous and complex. A summary of group mean scores 
and standard deviations can be seen in Table 3. 

Transitions between identical displays were faster than between other conditions. With the 
increased complexity of the overall task, even in the repetition class, both groups had a higher 
base-line response time than in Expt. I. The older subjects showed a slight increase in the 
disparity between their times and those of the young control group compared with Expt. I. 

Within the three classes of repetition one again finds that when both the relevant and irrelevant 
components were repeated the responses were significantly faster (P < 0.01) than when the 
irrelevant component was not repeated. If one compares the difference between groups from 
class 2 to class 3, then the old subjects appear to be more affected by this feature. However, in 
comparing the total increases in response times for both groups from their fastest time (506 and 
601 msec respectively) to the mean times observed when the irrelevant component is changed 
(624 and 728 msec) then one finds the difference between the groups has not significantly altered. 
They are still some 100 msec apart. In the cases of repeated identical stimuli (colour and shape) 
there is a suggestion for both groups that colour is perhaps a better mediator than shape for 
producing a fast response. 

Combining the three main classes of alternations together, or taking each as a separate class, 
there is a significant (P < 0.01) difference between these cases and the repetition of a coding rule. 


Table 3. Mean scores on six main classes of transition states in four-choice task 

                                                   Young                  Old 
                                              Mean (msec)  S.D.     Mean (msec)  S.D. 
Repetitions 
  Colour same (shape same)                       506       375         601       334 
  Shape same (colour same)                       565       336         635       325 
  Colour same (shape alt.)                       624       335         728       424 
Alternations 
  With common colour component                   711       386         841       472 
  With common shape component                    747       465         820       444 
    comprised of: 
    colour after colour rule (shape same)        704                   731 
    colour after shape rule (shape same)         733                   847 
    shape after colour rule (shape same)         805                   833 
  With colour and shape alt.                     748       429         815       493 
    comprised of: 
    colour after colour rule (shape alt.)        715                   758 
    colour after shape rule (shape alt.)         744                   799 
    shape after colour rule (shape alt.)         787                   886 


MR = Motor repetition. MA = Motor alternation. 


However, within this large group of alternations a number of less obvious features are of 
interest. 

Colour again appears to be a better mediator for response times than does shape in regard to 
coding rules. This is clearly shown in the cases of total alternation and where there is a common 
component by the consistent ordering of the colour after colour, colour after shape and finally 
shape after colour rules, for both groups. With the alternating shape rule (when there is a 
common colour component) the old subjects seem to suffer greater distraction. They appear less 
able to extract the critical feature. Indeed, there is evidence throughout the results that when a 
disjunction has to be made between either shape or colour coding, the task is made more difficult 
if the second of the two displays contains an element of the first. The old subjects again 
appeared to be greatly assisted here by a total disjunction of both colour and shape. In fact the 
difference between their response times and those of the young group at this point was only 
67 msec. 

It is interesting at this juncture to note that in taking the differences between the fastest and 
slowest times overall of both young and old groups, both show a total difference of 240 msec. 
Thus, although there are individual variations between the groups in their ability to abstract 
assistance from different coding rules, the overall performance of the old subjects does not 
appear to have been hindered to any significantly greater degree by the added complexity of the 
task compared with that of the young group. 

It seems that the different coding rules, rather than the repetition of a physical motor act, are 
the overriding factor in any repetition effect observed. In the breakdown of the subclasses within 
the alternation group where there is a common shape component, motor repetitions are consistently 
faster than motor alternations. However, when a total alternation of signals occurs, motor 
repetitions do not in any way appear to facilitate a response; indeed, they are markedly slower, 
by some 100 msec, for both young and old groups. Here a total alternation does not assist the 
identification of a response (as in Expt. I) but merely indicates that the required response must 
be chosen from one of the remaining categories. 

With the added complexity of the task, error patterns between the two groups did show a 
significant difference (P < 0.01). The mean number of errors for the young group in each of the 
following categories (repetition, repetition with shape alternating, and all alternations) was 3.6, 
6.25 and 15.15 respectively. For the old subjects the respective means were 25.1, 28.5 and 37.3 
(see Table 2). However, the distribution of errors was not apparently random as in Expt. I and it 
appeared that the older subjects made most of their errors early in the sequence, and as they 
became more practised at the task their errors decreased rapidly. Consequently it was decided in 
Expt. III not only to add a further dimension of complexity but to also provide additional blocks 
of trials in order to allow sufficient time for practice and to observe patterns of learning curves 
and errors over a longer period of time. 


Discussion 


The results of this experiment support and extend recent findings reported by Rabbitt (1968) and 
Rabbitt & Vyas (1973) on young subjects. They found that repetition of classes of signals can be 
mediated by the use of coding rules which define S-R sets. That is, when a set of complex 
stimuli can be grouped in terms of stored components (colour or shape in these experiments) 
then this grouping may be used as a mnemonic code to recall the appropriate signal-response 
pairing. Subjects may be assisting themselves by remembering that ‘this is a red and amber + 
finger, and this is a green and amber — finger’. It is therefore not just the repetition of a motor 
response, but the repetition of a perceptual-coding rule which may facilitate RT. This apparently 
applies equally strongly to the old as well as the young subjects and is quite a robust 
phenomenon. 

There is the suggestion, however, that this observation may only apply in the early learning 
stages of such a complex task. That is, as the subject becomes more practised aspects of the 
effect may ‘wash out’. Support for this hypothesis may also be indicated from the distribution of 
errors observed in this experiment. Perhaps the initial strategies used in learning the task and 
later strategies, developed with practice during a single sequence of 500 presentations, are 
confounded to give mixed effects. 

In order to determine if any transition could be observed between these strategies, and indeed 
whether old people might differ from the young in the rate at which they change from one 
strategy to another, it was decided to use all six possible display states in signal-response 
pairings in the following experiment. Additional time would obviously be needed to reach 
optimal performances under these conditions. This was done by testing in four successive blocks 
of trials. 

Finally, it should be pointed out that there was nothing in these data to suggest that the slowing 
on the part of the older subjects might be considered to result from impaired psychomotor rather 
than decision-making efficiency, as has been recently reported by Waugh, Fozard, Talland & 
Erwin (1973). However, the time between trials in their experiment was approximately 10-15 sec, 
so theirs was quite a different task. 


Experiment III 
Subjects 
The same matched subjects were recalled six weeks later for this experiment. 


Apparatus 


SPARTA was again used, and a new set of tapes providing four blocks of 500 random and equiprobable 
signals was produced. 

Procedure 
Subjects received the same instructions as regards general performance. In this experiment, though, they 
were to regard each stimulus presentation as a separate state; in other words this was a six-choice task. A 
separate response key was used for each of the six possible responses and subjects used the first, second 
and third fingers of their right and left hands. Allocations of responses (fingers) to displays were balanced 
across subjects so that transitions between successive responses would not reflect either the relative ease 
with which different sequences of finger movements might be made, or any possible idiosyncrasies of coding 
compatibility (see Fig. 1). 

After the initial instructions and explanations, subjects were allowed a practice period to determine 
whether they had in fact mastered ‘the rules’. This being possibly rather difficult, especially for older 
subjects, the six transition states were first given in sequence before being randomized in the practice trials. 
Four blocks of 500 signals were presented with a 3 min rest period between each. 


Results 


Each of the four blocks of 500 presentations was analysed into the 36 possible transitions 
between successive displays and responses. For each subject the six cases of identical transitions 
were abstracted and pooled to provide mean RTs and standard deviations. Similarly the six 
cases in which the colour of successive displays was repeated (but not the shape) were also 
separated. The remaining transitions contained 12 cases in which the shape component remained 
common and 12 in which there was a complete alternation of shape and colour. The results are 
set out in Table 4 for blocks 1 and 4. 
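
The class sizes given above (6 identical, 6 common-colour, 12 common-shape and 12 total alternations among the 36 ordered pairs) are what one would expect if the six displays were formed by crossing three colours with two shapes. That factor structure is an assumption made here for illustration only, as are the labels `c1`, `s1` and so on; the sketch below simply enumerates the transitions and counts each class.

```python
from collections import Counter
from itertools import product

# Assumption: six displays = 3 colours x 2 shapes (labels are hypothetical).
COLOURS = ("c1", "c2", "c3")
SHAPES = ("s1", "s2")
DISPLAYS = list(product(COLOURS, SHAPES))


def classify(prev, curr):
    """Name the transition class for two successive (colour, shape) displays."""
    same_colour = prev[0] == curr[0]
    same_shape = prev[1] == curr[1]
    if same_colour and same_shape:
        return "identical"
    if same_colour:
        return "common colour"
    if same_shape:
        return "common shape"
    return "total alternation"


# Every ordered pair of displays is one of the 36 possible transitions.
counts = Counter(classify(p, c) for p, c in product(DISPLAYS, repeat=2))
for name, n in counts.items():
    print(f"{name}: {n}")
```

Under this assumed layout the four counts sum to 36, matching the breakdown described in the text.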


Table 4. Mean scores on four classes of transition states in six-choice task 


                                            Mean RT (msec)
                                       Young               Old
                                  Block 1  Block 4    Block 1  Block 4
Repetitions                         609      582        836      709
Alternations with common colour     844      718       1240      970
Alternations with common shape     1060      927       1371     1075
Alternations                       1145      970       1473     1131


A consistent pattern was found across subjects and conditions, and a three-factor (age × trials 
× task transition conditions) analysis of variance revealed significant differences between groups 
and between transition classes (P < 0.001). A significant interaction between age and task 
complexity was observed in block 1 but not in block 4. Separate analyses for both groups 
revealed that RTs for identical transitions were significantly faster (P < 0.001) than any for 
alternated signals and responses. Alternations between displays with a common colour 
component were also significantly faster than alternations between displays with a common 
shape component (and total alternations), P < 0.01, though the latter did not differ significantly 
from total alternations at this level. 

Combining the last two categories of alternations, Fig. 2 shows the slopes for RT against 
transition difficulty for both groups in the three classes of repetition, alternation with common 
colour component and all other alternations separated for blocks 1 and 4 (i.e. early and late in 
practice). 

Figure 2. Slopes for RT transition difficulty in blocks 1 and 4 for repetitions, alternations with common 
colour and alternations. 


It can be seen that the older subjects were considerably more slowed by the increased 
complexity between transitions within the task during early practice, but when one considers 
overall performance on the final block of trials differences between the young and old groups are 
fairly constant (P < 0.02). This becomes even more clear when one considers the learning curves 
over the four blocks of trials (see Fig. 3). The old subjects were obviously slowed more than the 
young by the greater complexity of this task, but correspondingly they showed much more 
learning than their younger counterparts. 

Error patterns also revealed some interesting points (see Table 2). The old subjects made many 
more errors, even on repetitions, than did the young on block 1, but during blocks 3 and 4 they 
consistently made fewer errors than the young subjects (P < 0.01 in both comparisons). 


Discussion 


The fact that repetitions of identical signal states produced faster response times than any other 
transitions requires no comment. We may assess the overall magnitude of the repetition effect by 
taking the total difference score (slowest alternations minus repetitions) for both groups. From Table 4 
we see that in block 1 this was 536 msec for the young subjects and 637 msec for old subjects. 
In block 4 this was 388 and 422 msec respectively. It thus appears that when subjects are 
unpractised, the magnitude of the repetition effect in a complex task is greater for the old than 
for the young. However, when subjects are moderately practised it again seems that age adds a 
‘constant’ to all response times, irrespective of the difficulty of the transition involved. 

Figure 3. Learning curves over four blocks for repetitions, alternations with common colour and alternations. 
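
The repetition-effect scores quoted above can be reproduced directly from the Table 4 means. The short sketch below does only that arithmetic; the dictionary layout and the function name `repetition_effect` are illustrative choices, not part of the original analysis.

```python
# Mean RTs (msec) as read from Table 4: (group, block) -> class -> mean RT.
TABLE_4 = {
    ("young", 1): {"repetitions": 609, "total_alternations": 1145},
    ("young", 4): {"repetitions": 582, "total_alternations": 970},
    ("old", 1): {"repetitions": 836, "total_alternations": 1473},
    ("old", 4): {"repetitions": 709, "total_alternations": 1131},
}


def repetition_effect(group, block):
    """Slowest alternations minus repetitions, in msec."""
    row = TABLE_4[(group, block)]
    return row["total_alternations"] - row["repetitions"]


for block in (1, 4):
    for group in ("young", "old"):
        print(f"block {block}, {group}: {repetition_effect(group, block)} msec")
```

This yields 536 and 637 msec in block 1 and 388 and 422 msec in block 4, the values cited in the text.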

It is also clear that both young and old subjects show facilitation of response times when a 
particular coding ‘rule’ is employed on immediately successive trials, though neither signal nor 
response is repeated. That is, when subjects have to consider both colour and shape, the 
repetition of either one of these features apparently facilitates performance. However, as in 
Expt. II, the repetition of such a coding rule seems to favour the old considerably less than 
the young. Previous work (Rabbitt & Vyas, 1973) has suggested that as subjects become 
increasingly practised at a task, repetition of coding rules becomes less important. A relevant 
comparison therefore in this experiment is between alternations with a common colour 
component (the fastest alternations) and alternations with no common components. During 
block 1 of practice the differences for the young and old subjects were 301 and 233 msec. By 
block 4 the respective differences were 192 and 161 msec. It would seem that for both young 
and old subjects such a mediating coding rule does indeed become less valuable with practice. 

The slowest RTs were found when there was a total change in both the colour and shape 
components of the signal. Again the total change in signal did not facilitate subjects' responses 
since it did not signify a simple motor alternation as previously suggested in Expt. I, but 
merely indicated that one of the five remaining responses was required. Further processing of 
information was therefore needed, and the locus of the resulting decrements in RT might 
reasonably be considered as being central phenomena. 

By allocating the additional time needed for performances to approach stability into four 
blocks, it was found that the suggested differences in error patterns in Expt. II were indeed 
borne out by results in this study (see Table 2). Indeed, the differences between the groups are 
somewhat unexpected compared with previous data reporting that old people are relatively 
inaccurate in choice RT tasks. True, their errors were more numerous on the first trials, but by 
the last two blocks they were well below the error rates of the young subjects. This, together 
with the large and quite rapid amounts of learning on the part of the old subjects must make one 
a little unsure of the absolute values set to ageing decrements in much of the previous literature. 
It certainly appears true that as the complexity of a task increases the initial response times of 
old people are greatly increased relative to those of the young. However, the more they are 
practised, the more the interaction between age lag and task difficulty disappears. Eventually it 
seems that one observes a fixed lag which may be due to a fixed central nervous system lag for 
all tasks. Old people are slower, but in other respects it seems that they initially behave like 
unpractised young people rather than showing complex differences in strategy. 

Although minor differences in the ability to make use of common features of signals occurred 
between young and old groups, there seemed to be no obvious differences in overall strategies 
used by either group as regards the perceptual processing of information. The present results 
cannot, of course, be generalized to all possible tasks. It must be remembered that this was a 
limited study in which only single simple motor responses were required rather than complex 
programmed sequences of responses. The latter aspect will need to be more thoroughly 
investigated with old people, before any strong claims of a fixed lag with ageing can be made. It 
would seem though that from a practical point of view old people might not be at quite such a 
disadvantage as has previously been thought. They are slower, there is no doubt; but if this 
general central lag is taken into account, and if one can find suitable mediating components in 
complex tasks, then provided sufficient practice time is allowed, initial learning periods might not 
be unduly long or frustrating. These results are therefore encouraging for those concerned with 
the training of elderly subjects to perform quite complex perceptual motor skills. 


The measurement of imagery vividness: Normative data and their 
relationship to sex, age, and modality differences 


K. D. White, R. Ashton and R. M. D. Brown 





Over 2000 undergraduate psychology students completed Sheehan's (1967a) revision of Betts’ (1909) 
Questionnaire Upon Mental Imagery. Females and older subjects reported significantly more vivid imagery 
than males or younger subjects. When modalities were ranked in order of evoked imagery vividness the 
pattern did not agree with previous work and it was suggested that rankings may be test specific rather than 
universal. Additional data were presented on the test's reliability, and the first set of norms derived taking 
into account the significant sex and age differences. 





One of the most widely used self-report measures of imagery vividness is Sheehan's (1967a) 
revision of Betts' (1909) Questionnaire Upon Mental Imagery (QMI). To date, however, there 
has been no systematic investigation of empirical issues relating to sex, age and modality 
differences; figures relating to the reliability of the QMI are scant, and the ability to make 
comparisons between individuals' scores by reference to normative data is absent. This paper 
aims to overcome these deficiencies. 


Sex differences 


Whether males or females have the better imagery is a matter of some dispute. This issue may 
interact with other important parameters such as age and imagery modality and is further 
complicated by the type of measure used — either objective or subjective, and the specific 
parameter investigated — colour, form or vividness, as well as imagery control. 

One of the earliest recorded observations was that of Galton (1883), who noted that 'the 
power of visualizing is higher in the female sex than in the male' (p. 69). His view has received 
support from later workers who have reported that after photic stimulation females report more 
colour imagery than males (P. A. Marks, 1962; Palmer & Field, 1968), and the better imaginal 
processes of females have been used in explanation of their superiority in free and incidental 
learning (Sheehan, 1971), and in the recall of picture detail (D. F. Marks, 1973). Michael (1967) 
found that visual imagery was stronger in females; Sheehan (1967a) noted that females reported 
more vivid imagery, and in Griffitts' (1927) experiments women reported more auditory imagery 
than men. The superiority of males (at least at younger ages) in visual imagery has been reported 
by Christiansen (1969), and P. A. Marks (1962) found that male subjects gave a greater variety of 
form imagery than did females. Finally, three studies have found that sex differences are 
unimportant. Davis (1932) used a variety of objective and subjective measures; Di Vesta, 
Ingersoll & Sunshine (1971) examined total QMI scores, and Hiscock & Cohen (1973) used only 
the visual and auditory scales of the QMI. 


Age differences 


There is little information with respect to the effect of age on imagery. While the need to control 
for effects such as age has been recognized (Sheehan, 1972), the only clear statement comes 
again from Galton (1883), who observed, ‘after maturity the further advance of age does not 
seem to dim the faculty but rather the reverse’ (p. 69). 


Modality differences 


The relative strength of imagery in various modalities has not been systematically investigated, 
although attempts have been made to relate modality differences to imagery types (Stern, 1938; 
Diehl & England, 1958), and to physiological functioning (Golla & Antonovitch, 1929; Short, 
1953). The general inability of investigators to devise reliable typologies has led to a decline in 
such attempts although the use of more sophisticated statistical techniques may yet provide an 
ideational Rosetta stone (White & Ashton, 1975). Whether or not one subscribes to a typology 
based on a dominant sense, or a combination of senses, there may be benefit in examining 
separate modalities. This procedure may enable experimenters to more efficiently pre-select 
individuals on the basis of relevant criteria, e.g. clear cutaneous imagery may relate to 
handedness, and vivid gustatory imagery could be important in salivation experiments. It may 
well be that modalities other than the two traditionally used (visual and auditory) are equally, if 
not more, important. In this connection Lindauer (1969) examined over 200 words in terms of 
the ease and vividness of their evoked imagery. He found that most imagery was evoked by the 
cutaneous and gustatory modalities; visual and olfactory were next, and auditory words evoked 
the least. 


Reliability 


So far as the QMI is concerned, reliability data are scant. Relevant figures are presented in Table 
3, and are derived from three sources: data which are incomplete (Sheehan, 1967b), complete 
figures in which male and female scores are combined, and thus potentially confounded (Evans 
& Kamemoto, 1973), and finally coefficients obtained in the present study. 


Method 

Subjects 

The sample was the majority of first-year psychology students for 1972, 1973 and 1974 at the University of 
Queensland. In the main study there were 2213 students: 1385 females whose ages ranged from 16 to 56 
(mean = 20.4, S.D. = 5.9) and 829 males who ranged in age from 16 to 51 (mean = 22.3, S.D. = 6.3). Distributions by 
age and sex are set out in Table 1. 


Table 1. Distributions of the sample by age and sex 


Age range    Male    Female
16-17         144      462
18            132      359
19-20         152      232
21-25         241      154
26+           160      177
Overall       829     1385


Note. The differences in numbers of males and females reflect sex differences in enrolments. 


Test description 


The shortened QMI measures vividness of mental imagery in seven sensory modalities — visual, auditory, 
cutaneous, kinaesthetic, gustatory, olfactory, and organic. Questions within the test are arranged in seven 
blocks of five, and require mental images, which are aroused by each question, to be rated on a seven-point 
scale from 'Perfectly clear and vivid' (1), to 'No image at all' (7); low scores thus represent the most vivid 
imagery. The scale has been reproduced in books by Richardson (1969) and Hilgard (1970), and, together 
with the original analysis, is available from the Photo Duplication Service in the Library of Congress (see 
Sheehan, 1967a). 

Procedure 
Norms. The questionnaire was routinely administered to students during class time. It was the first of a small 
battery and was given in the third week of term when students were ‘test-unsophisticated’. 


Reliability. The same test was readministered one year later to the 251 students who comprised the reliability 
sample. Testing was again early in the first term, and in a class setting. 


Results and discussion 


Means and standard deviations used in making comparisons are presented in Part A of Table 2, 
and the results of the significance tests are given in Part B. 


Table 2. Imagery vividness scores: Means, standard deviations and t values for all groups 

                                        Modality
                     Vis.    Aud.    Cut.    Kin.    Gus.    Olf.    Org.    Total
(A) Groups
Male
  Overall   Mean     12.6    13.0    14.4    12.6    14.2    15.8    12.8    95.4
            S.D.      4.2     4.8     5.1     5.7     5.2     5.8     4.6    24.8
  Young     Mean     12.7    13.4    14.6    12.8    14.4    16.2    12.8    96.8
            S.D.      4.2     4.6     5.0     4.7     5.2     5.9     4.6    24.3
  Old       Mean     12.4    11.8    13.6    11.8    13.4    14.5    12.6    90.2
            S.D.      4.2     5.1     5.6     4.6     5.1     5.4     4.3    25.1
Female
  Overall   Mean     11.4    13.4    12.6    12.2    13.7    14.9    11.9    90.2
            S.D.      3.9     4.7     4.6     4.5     5.3     5.8     4.6    24.0
  Young     Mean     11.4    13.5    12.6    12.2    13.6    15.1    11.8    90.3
            S.D.      3.6     4.5     4.6     4.4     5.2     5.7     4.5    22.8
  Old       Mean     10.9    12.4    12.0    11.7    13.1    13.2    12.7    86.3
            S.D.      5.4     5.7     5.4     5.4     5.5     6.3     5.6    32.3
(B) Comparisons
Male-female
  Overall   t         6.7**   1.9     8.3**   2.0*    2.2*    3.5**   4.4**   4.9**
  Young     t         6.8**   0.5     8.5**   2.7**   3.2**   3.9**   4.6**   5.7**
  Old       t         2.9*    1.0     2.2*    0.2     0.5     2.0*    0.2     1.2
Old-young
  Male      t         0.4     2.3*    2.1*    1.1     2.2*    3.5**   0.5     3.0**
  Female    t         1.2     2.5*    1.4     1.2     1.2     3.8**   2.1*    1.8

Note. For each modality, scores range from 5 to 35; for the total, the range is 35 to 245. Throughout 
Part A, lower mean scores represent more vivid imagery. 
* P < 0.05; ** P < 0.01. 


Age and sex differences 


The first analysis examined sex differences. As far as overall scores are concerned it is apparent 
that females report more vivid imagery than males. Differences were statistically significant for 
all modalities but auditory, where males were marginally superior. 
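
As a rough check on how the male-female t values in Part B of Table 2 relate to the summary statistics in Part A, a t statistic can be recomputed from the means, standard deviations and group sizes. The paper does not say which form of t-test was used, so the unequal-variance (Welch) form below is an assumption, and the function name `t_from_summary` is hypothetical. Because the published means and S.D.s are rounded to one decimal place, agreement can only be approximate.

```python
import math


def t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t from summary statistics (unequal-variance form; an assumption)."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


# Overall male (n = 829) versus overall female (n = 1385), values read from Table 2.
t_visual = t_from_summary(12.6, 4.2, 829, 11.4, 3.9, 1385)
t_auditory = t_from_summary(13.0, 4.8, 829, 13.4, 4.7, 1385)

print(f"visual:   t = {t_visual:.2f}")    # positive: males score higher (less vivid)
print(f"auditory: t = {t_auditory:.2f}")  # negative: males score lower (more vivid)
```

The recomputed magnitudes fall close to the tabled values of 6.7 for the visual modality and 1.9 for the auditory modality.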

In the course of examining age levels it was found that although more vivid imagery was 
reported with increasing age, only comparisons involving the oldest age groups were significant. 
For this reason the 'old' male-female comparison involved students who were 26 or older; the 
'young' male-female comparison was based on students aged 16 to 25. Again there were sex 
differences. Older females reported significantly more vivid visual, cutaneous, and olfactory 
imagery while for younger female students results parallel exactly those obtained for the overall 
data. Present findings, that females have more vivid imagery than males at all age levels, thus 
offer strong support to previous work which has demonstrated female superiority in a wide 
variety of imagery tasks. 

Age effects were examined using the dichotomy previously described. It was found that older 
males reported significantly more vivid auditory, cutaneous, olfactory, and total imagery, 
whereas older females reported significantly more vivid auditory and olfactory imagery. The only 
reversal occurred in the organic modality where younger females were superior. The reason for 
this is not readily apparent; perhaps items in this section of the QMI are difficult for older 
persons to vividly imagine. Nevertheless our results support Galton's (1883) observation 
regarding imagery and age and, more importantly, as far as the present study is concerned, 
highlight the importance of age groupings for the presentation of normative data. 

While examining age differences we were fortunate in being able to obtain test scores on a 
small sample of elderly females. The sample consisted of 13 females who ranged in age from 68 
to 82, with a mean of 72.4, and were drawn from a larger group associated with a research 
project examining the effects of ageing. Their mean scores for the seven modalities were: 8.8, 
9.2, 9.4, 10.5, 8.4, 12.2, and 13.5. When these are compared with scores from the oldest female 
group, set out in the last two lines of Part A in Table 2, it can be seen that in all but one 
modality (organic), mean scores indicate more vivid imagery. Differences were significant in the 
visual (t = 2.2, d.f. = 198, P < 0.05), auditory (t = 3.0, d.f. = 198, P < 0.05), cutaneous (t = 2.5, 
d.f. = 198, P < 0.05), and gustatory (t = 4.6, d.f. = 198, P < 0.01) modalities. These figures lend 
additional support to present findings and Galton's comment. 


Reliability 

Presented in Table 3 are available figures on the reliability of the Betts' QMI. They display the 
following features. There is a tendency for reliability coefficients to decrease in size as the 
test-retest interval increases. The only exception is in the visual modality where a seven-month 
interval resulted in a larger reliability than a six-week interval. 

Present data confirm the test's reliability and suggest that male students respond more 
consistently than female students as in all but one modality, kinaesthetic, male coefficients are 
larger. A Fisher exact probability test (Siegel, 1956) revealed this sex difference to be significant 
beyond the 0.05 level. 

Comparisons between first and second occasion mean scores showed that on all but the visual 
modality, second occasion means were higher — indicating less vivid imagery. None of the 
differences, however, were significant. The reliability of individual questions was also obtained. 
Data ranged from 0.22 (question 20) to 0.56 (question 27) for total scores; from 0.14 (question 18) 
to 0.62 (question 27) for males, and from 0.19 (question 19) to 0.54 (question 27) for females. 
These data confirm coefficients reported in Table 3, as questions 18, 19, and 20 are from the 
modality with the lowest reliability, kinaesthetic, and question 27 is from the olfactory modality 
which consistently displays high reliabilities. 


Table 3. Reliability data for the Betts’ QMI 

                                  Investigation
                 Sheehan      Evans &                Present study
                              Kamemoto
Modality         (1967b)      (1973)          Male     Female    Overall
Visual           0.78         0.67            0.58     0.42      0.52
Auditory         -            0.74            0.49     0.44      0.46
Cutaneous        -            0.82            0.53     0.42      0.51
Kinaesthetic     -            0.74            0.29     0.32      0.32
Gustatory        -            0.75            0.51     0.42      0.46
Olfactory        -            0.72            0.60     0.58      0.59
Organic          -            0.61            0.49     0.48      0.51
Total            0.78         0.91            0.63     0.54      0.59
n                62           35              89       162       251
Subjects         American     American        Australian students
                 students,    students,
                 male         male and
                              female
Test/retest      7 months     6 weeks         12 months
interval

Note. All coefficients are significant at P < 0.01 level. 


Modality differences 


The first issue investigated was the extent of intermodality differences for males and females. 
For females all 21 comparisons were significant, one at the 5 per cent level, the remainder at the 
1 per cent level. For males, although 16 comparisons were significant (two at the 5 per cent 
level, and 14 at the 1 per cent level), there were five non-significant differences: 
visual/kinaesthetic, kinaesthetic/organic, visual/organic, organic/auditory, and 
gustatory/cutaneous. This is the first occasion on which significant differences between QMI 
modalities have been reported. In the past, experimenters have used summed modality scores of 
either the most important modalities, e.g. visual and auditory, or of the total. While the 
existence of a general imagery factor (Sheehan, 1967a; White, Ashton & Law, 1974) may 
support the latter convention, there is the strong possibility that the use of individual modality 
scores may improve predictive efficiency. 

The second issue was whether there were distinctive male-female differences in modality 
strengths. When ranked modality scores were compared, the two orders, while not identical, 
were significantly correlated (rho = 0.86, d.f. = 7, P < 0.02). From most to least vivid imagery the 
orders were, for males, visual, kinaesthetic, organic, auditory, gustatory, cutaneous and olfactory; 
and for females, visual, organic, kinaesthetic, cutaneous, auditory, gustatory and olfactory. Thus, for 
both sexes, visual items yielded the most vivid imagery and olfactory items yielded the least 
vivid. In so far as the auditory modality is concerned, present findings support Lindauer's (1969, 
1972) views that traditional usage need not necessarily be the most appropriate. The failure of 
present results to agree with Lindauer's (1969) ranking suggests that his words and present QMI 
phrases may have produced task-specific rankings. Additional support for a specificity argument 
comes from Brower (1947), who reported the order visual, auditory, tactual, tactuo-kinaesthetic, 
thermal and olfactory, for a single imaginal event: onions being fried. Secondly, present results 
suggest that the traditional belief in imagery typologies, either for males, females or individuals, 
merits further investigation. 
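
The rank correlation between the male and female modality orders reported above can be recomputed from the two orderings alone. The sketch below uses the standard Spearman formula for untied ranks, rho = 1 - 6 * sum(d^2) / (n(n^2 - 1)); the function name `spearman_rho` is illustrative, and treating the orders as free of ties is an assumption based on how they are listed in the text.

```python
MALE_ORDER = ["visual", "kinaesthetic", "organic", "auditory",
              "gustatory", "cutaneous", "olfactory"]
FEMALE_ORDER = ["visual", "organic", "kinaesthetic", "cutaneous",
                "auditory", "gustatory", "olfactory"]


def spearman_rho(order_a, order_b):
    """Spearman rank correlation for two orderings of the same untied items."""
    n = len(order_a)
    rank_b = {item: rank for rank, item in enumerate(order_b, start=1)}
    # Squared rank differences, item by item.
    sum_d2 = sum((rank_a - rank_b[item]) ** 2
                 for rank_a, item in enumerate(order_a, start=1))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


rho = spearman_rho(MALE_ORDER, FEMALE_ORDER)
print(f"rho = {rho:.2f}")
```

The squared rank differences sum to 8, which gives a coefficient that rounds to the 0.86 reported in the text.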

The final and perhaps most important purpose of this paper was to provide normative data 
on that group of human subjects most widely used in psychological research: undergraduate 
psychology students. Because significant age and sex differences were obtained, separate tables 
were derived for young males, young females, old males, and old females. These 
are presented as Tables A, B, C, and D in the Appendix. 


Implications 

The data presented above have many implications for both future work and the interpretation of 
past studies which involve subjects' imagery ability as an experimental variable. Perhaps the 
most important aspect is that women were found to manifest, or at least report (see White et al. 
1974 for a discussion of imagery manifestation versus reporting as a complicating factor in 
interpreting imagery questionnaire scores), significantly more intense imagery than men. This 
immediately suggests that future reports on the effects of rated imagery abilities on other tasks 
(e.g. memory tasks; Marks, 1973; Gur & Hilgard, 1975) should give not only subject sex 
information, but also data concerning the distribution of the imagery scores of the two sexes and 
an analysis of sex effects on the dependent variables. For example, Gur & Hilgard (1975) studied 
10 male and 10 female subjects and found, in their total sample of 20, that they had 10 poor 
imagers and 10 good imagers, as assessed by D. F. Marks' (1973) Vividness of Visual Imagery 
Questionnaire. Our normative data would suggest that the majority of the 10 good imagers would 
be female and vice versa. If this was the case, and unfortunately we have no way of knowing 
this from Gur & Hilgard’s paper, then an important question of interpretation arises - namely, 
were the effects these authors reported due to imagery vividness differences or due to a sex 
difference of some, at present unknown, kind or due to an interaction between sex and rated 
imagery vividness? 

The present normative data would also suggest that if experimenters wish to demonstrate the 
efficacy of imagery instructions (Paivio, 1971) on cognitive activity then the maximal disparity 
between conditions should be obtained if older women were used in the imagery instruction 
group and younger men in the non-imagery instruction group. This prediction would be easy to 
test experimentally. 

On a more applied note, Schwartz (1973) presented some, admittedly rather anecdotal, 
evidence that patients suffering various psychosomatic illnesses found relief by imagining certain 
situations. For example, he cites a patient suffering from Raynaud's disease who could relieve 
his symptom of cold feet by thinking ‘hot thoughts’ (Schwartz, 1973, p. 671), such as lying on a 
beach. Our data suggest that older women would respond best to treatment of this kind, and that 
younger men would benefit from it only after they had been trained in ‘imaginal skills’ (see 
Phillips, 1971). 
Appendix 
Normative data on the QMI for psychology students 


Table A. Males aged 16-25 (n = 699) 

                                Modality 
Sten    Vis.    Aud.    Cut.    Kin.    Gus.    Olf.    Org.    Total 
1       5-7     5-6     5-8     5-6     5-7     5-8     5-6     35-64 
2       8       7-8     9       7-8     8-9     9-10    7-8     65-74 
3       9       9-10    10-11   9       10      11-12   9       75-79 
4       10      11      12      10      11-12   13      10      80-84 
5       11      12      13      11      13      14-15   11      85-94 
6       12      13      14      12      14      16      12      95-99 
7       13      14-15   15-16   13-14   15-16   17      13-14   100-104 
8       14-15   16      17      15      17      18-20   15      105-114 
9       16-17   17-18   18-20   16-18   18-20   21-23   16-18   115-129 
10      18+     19+     21+     19+     21+     24+     19+     130+ 
Mean    12.7    13.4    14.6    12.8    14.4    16.2    12.8    96.8 
S.D.    4.2     4.6     5.0     4.7     5.2     5.9     4.6     24.3 








Table B. Males aged 26 plus (n = 160) 

                                Modality 
Sten    Vis.    Aud.    Cut.    Kin.    Gus.    Olf.    Org.    Total 
1       5-6     5       5-6     5       5-6     5-6     5-6     35-54 
2       7-8     6       7-8     6-7     7       7-8     7-8     55-69 
3       9       7-8     9-10    8-9     9       9-11    9       70-74 
4       10      9-10    11      10      10-11   12      10      75-79 
5       11      11      12      11      12      13      11      80-84 
6       12      12      13      12      13      14-15   12      85-89 
7       13      13      14      13      14-15   16-17   13-14   90-94 
8       14-15   14      15-17   14      16      18      15      95-104 
9       16-17   15-17   18-20   15-16   17-20   19-20   16-17   105-114 
10      18+     18+     21+     17+     21+     21+     18+     115+ 
Mean    12.4    11.8    13.6    11.8    13.4    14.5    12.6    90.2 
S.D.    4.2     5.4     5.6     4.6     5.1     5.4     4.3     25.1 

Table C. Females aged 16 to 25 (n = 1208) 

                                Modality 
Sten    Vis.    Aud.    Cut.    Kin.    Gus.    Olf.    Org.    Total 
1       5-6     5-7     5-6     5-6     5-6     5-7     5       35-59 
2       7       8-9     7-8     7       7-8     8-9     6-7     60-69 
3       8       10      9       8-9     9-10    10-11   8       70-74 
4       9       11      10      10      11      12      9       75-79 
5       10      12      11      11      12      13-14   10      80-84 
6       11      13      12      12      13      15      11      85-89 
7       12      14      13      13      14-15   16-17   12      90-99 
8       13      15-16   14-15   14-15   16-17   18-19   13-14   100-104 
9       14-15   17-18   16-18   16-17   18-20   20-22   15-16   105-114 
10      16+     19+     19+     18+     21+     23+     17+     115+ 
Mean    11.4    13.5    12.6    12.2    13.6    15.1    11.8    90.3 
S.D.    3.6     4.5     4.6     4.4     5.2     5.7     4.5     22.8 





Table D. Females aged 26 plus 

                                Modality 
Sten    Vis.    Aud.    Cut.    Kin.    Gus.    Olf.    Org.    Total 
1       5       5       5       5       5-6     5       5       35-49 
2       6       6       6       6       7-8     6-7     6-7     50-59 
3       7       7-8     7-8     7-8     9       8       8       60-64 
4       8       9       9       9       10-11   9       9-10    65-74 
5       9       10      10      10      12      10-11   11      75-79 
6       10      11-12   11-12   11      13      12      12      80-84 
7       11      13-14   13      12      14      13-14   13-14   85-89 
8       12      15      14-15   13      15      15-17   15      90-99 
9       13-14   16-17   16-17   14-17   16-19   18-20   16-18   100-114 
10      15+     18+     18+     18+     20+     21+     19+     115+ 
Mean    10.9    12.4    12.0    11.7    13.1    13.2    12.7    86.3 
S.D.    5.4     5.7     5.4     5.4     5.5     6.3     5.6     32.3 

The role of visual imagery in visual, tactual and cross-modal matching 


Ed Cairns and Peter Coll 


Piaget & Inhelder (1956) suggested visual imagery as a possible mechanism involved in children's 
cross-modal, particularly tactual-visual perception. Abravanel (1973) has reported introspective 
reports from adult subjects indicating 'that visual images accompanied haptic exploration'. Koen 
(1971) manipulated the 'labelability' of the stimuli in a tactual-visual task and also attempted to 
determine the extent to which his subjects actually made use of labels, by means of a 
post-experiment interview. His results questioned whether ‘the verbal label is indeed the 
functional mediator in this kind of task'. Also, pilot work by Koen (1971) had led him to note 
that while tactual perception often evoked reports of visual imagery the converse rarely 
occurred. 

This is of interest because Jones & Connolly (1970), on the basis of length matching in a 
kinaesthetic-visual task, have suggested that information from the first observed stimulus is 
recoded in the modality of the second. This would mean that the use of visual imagery would 
enhance performance in a tactual to visual match but not in a visual to tactual match, where 
‘tactual imagery’ would be required. Pick (1970) has, however, suggested a simpler hypothesis, 
that information in all modalities is translated into a visual code. The present study therefore set 
out to investigate whether visual imagery would aid only those matching conditions in which the 
second presented stimulus was visual (e.g. visual-visual and tactual-visual) or if indeed visual 
imagery would also play a role in visual-tactual and tactual-tactual matching. 

Ninety undergraduates were given the Vividness of Visual Imagery Questionnaire (VVIQ) 
developed by Marks (1973). On the basis of their VVIQ scores the 12 most extreme subjects at 
either end of the distribution were chosen to form a group of good and a group of poor visual 
imagers. 

In order to increase the difficulty in labelling and thus increase the use of visual imagery 
(Koen, 1971), the stimuli employed were equal numbers of randomly shaped polygons, of 6, 8 or 
12 sides (Attneave & Arnoult, 1956). Of the 24 pairs of stimuli thus created, half contained one 
within-pair difference in shape while the remainder were identical. Differences were obtained by 
selecting one point or angle of the figure at random and moving this point in a randomly chosen 
direction for 2 cm. Tactual stimuli were cut out of 2 cm thick hardboard and visual stimuli were 
drawn on card. 
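The construction of a 'different' pair described above can be sketched as follows. This is only an illustration under stated assumptions: the polygon is supplied as a list of (x, y) vertices in centimetres, the function name is hypothetical, and no check is made that the altered outline remains free of self-intersections, which the original stimuli presumably were.

```python
import math
import random


def make_different_shape(vertices, shift_cm=2.0, rng=random):
    """Copy a polygon and move one randomly chosen vertex by shift_cm.

    vertices: list of (x, y) points in centimetres (assumed representation).
    Returns the altered copy and the index of the vertex that was moved.
    """
    changed = list(vertices)
    index = rng.randrange(len(changed))        # which point to move
    angle = rng.uniform(0.0, 2.0 * math.pi)    # randomly chosen direction
    x, y = changed[index]
    changed[index] = (x + shift_cm * math.cos(angle),
                      y + shift_cm * math.sin(angle))
    return changed, index


# Illustrative six-sided figure; the real stimuli were random polygons.
hexagon = [(10 * math.cos(k * math.pi / 3), 10 * math.sin(k * math.pi / 3))
           for k in range(6)]
altered, moved_index = make_different_shape(hexagon)
print(moved_index, altered[moved_index])
```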

Each subject was seen individually and asked to respond ‘same’ or ‘different’ to each of the 
24 pairs of stimuli. Tactual stimuli were presented in a fixed position behind a screen which 
permitted one-handed exploration but excluded vision. After a practice trial subjects were 
administered the four (two intra-modal and two cross-modal) conditions in random order. Also 
within each condition the six pairs of stimuli were presented in random order. Each pair of 
stimuli was presented successively with a 2 sec interval between the members of each pair. In an 
attempt to equate the information-gathering capabilities of the two modalities visual stimuli were 
presented for a maximum of 5 sec, tactual stimuli for a maximum of 30 sec (Butter & Bjorklund, 
1973). 

A 2 (good and poor imagers) by 4 (matching conditions) repeated measures analysis of 
variance revealed significant main effects for both the imagery variable (F = 16.92, d.f. = 1, 22, 
P < 0.001) and the conditions variable (F = 10.22, d.f. = 3, 66, P < 0.01). Further analyses of this 
latter effect by means of t tests revealed that the visual-visual condition (X̄ = 4.79) was 
significantly easier than the visual-tactual (X̄ = 3.04, t = 6.16, d.f. = 23, P < 0.05), the 
tactual-visual (X̄ = 3.54, t = 4.22, d.f. = 23, P < 0.05) and the tactual-tactual condition 
(X̄ = 3.92, t = 3.82, d.f. = 23, P < 0.05). Similarly the tactual-tactual and visual-tactual 
conditions also differed significantly (t = 3.33, d.f. = 23, P < 0.05). 

This last result (about which no hypothesis was entertained) where different levels of 
performance were associated with the different matching conditions, is the most difficult to 
explain. As Goodnow (1971) has noted, no consistency appears in the literature where the 
various combinations of intra- and cross-modal conditions have been compared except that the 
visual-visual condition is always the easiest. This makes it difficult to assess the importance of 
the difference between the tactual-tactual and visual-tactual condition, while the remaining 
differences between the visual-visual and the other conditions are probably best explained by the 
fact that these latter conditions all involved tactual perception and it is likely that visual images 
formed from tactual information are less than perfect. 

These results are of course secondary to the main results which indicate that good visual 
imagers were at an advantage in all matching conditions. That is, the lack of an imagery by 
conditions interaction suggests that even in the tactual-tactual condition the ability to produce 
clearer visual images was an asset. Therefore the importance of this study is that the results lend 
support to the theory of Pick (1970) but not of Jones & Connolly (1970) because it appears that 
visual imagery played a role even in those matching conditions where the second observed 
stimulus was tactual. 
Convergence-divergence and the learning of concrete and abstract 
sentences 


H. V. Sacks and Michael W. Eysenck 





School pupils (mean age 17 years) were classified as convergers or divergers on the basis of their 
performance on the AH5 Intelligence Test and the Uses of Objects test. They were presented with a set of 
concrete and abstract sentences, followed immediately by a forced-choice recognition test. There was a 
significant interaction between abstractness-concreteness and convergence-divergence, in which the effect of 
the abstractness-concreteness variable was highly significant for convergers, but was non-significant for 
divergers. This interaction was attributed to the superior ability of divergers to discover those different 
interpretations of a single nominal stimulus held to be inherent in abstract, though not in concrete, sentences. 
In addition, some evidence against Paivio's (1971) dual-coding model was presented. 





A considerable body of experimental evidence, reviewed by Paivio (1969, 1971), indicates that 
concrete words are better retained in memory than abstract words. Paivio (1968) found that word 
abstractness-concreteness was more strongly associated with performance on free recall and 
paired-associate learning tasks than several other variables, including word frequency and 
meaningfulness. Other work has concentrated on sentence memory, the usual finding being that 
concrete sentences are better remembered than abstract sentences (e.g. James, Thompson & 
Baldwin, 1973; Pezdek & Royer, 1974). 

Theoretically, Paivio (1971) proposed a dual-coding model, according to which concrete 
material has access to both a non-verbal imaginal code and a verbal symbolic code, whereas 
abstract material has access only to the latter. While this theory has received support from 
studies by Begg & Paivio (1969) and by Klee & Eysenck (1973), other evidence is inconsistent 
with the dual-coding model (e.g. Tieman, 1972; James et al. 1973). Furthermore, Johnson, 
Bransford, Nyberg & Cleary (1972) pointed out that Begg & Paivio (1969) had confounded 
abstractness-concreteness and comprehensibility, i.e. abstract sentences tended to be more 
difficult to comprehend than concrete sentences. 

An alternative interpretation of these data was offered by Klee & Eysenck (1973), who found 
that concrete sentences were consistently more rapidly comprehended than abstract sentences. 
They suggested that concrete sentences were likely to have a single dominant interpretation 
which could be readily comprehended, whereas abstract sentences tended to incorporate several 
potential interpretations, and so could only be interpreted unequivocally in the presence of a 
disambiguating context. This suggestion was supported by the work of Pezdek & Royer (1974). 
They found that the presentation of a disambiguating context at input enhanced recognition 
memory for abstract sentences, but not for concrete sentences. Klee (1975) presented phrases to 
her subjects, and required them to supply as many responses as possible in 90 sec that would 
complete the sentence. The greater interpretative variability inherent in the abstract sentences 
was revealed by two of the findings: (1) phrases containing abstract verbs produced responses 
belonging to a larger number of independent conceptual groupings than did phrases containing 
concrete verbs; (2) there was more response variability across subjects with abstract phrases 
than with concrete phrases. 

Henceforth we shall refer extensively to the interpretative variability hypothesis, i.e. the 
notion that a crucial difference between abstract and concrete sentences lies in the fact that the 
former can be interpreted in a greater variety of ways than the latter. A possible implication of 
this hypothesis is that performance on concrete and abstract sentences would be systematically 
affected by the convergence-divergence factor. More specifically, Guilford (1971) has argued 
that ‘convergent production rather than divergent production is the prevailing function when the 
input information is sufficient to determine a unique answer’ (p. 171). Convergent thinking would 
thus be more appropriate when attempting to comprehend and learn concrete sentences than 
abstract sentences. On the other hand, Guilford (1971) defined divergent production as follows: 
‘Generation of information from given information, where the emphasis is upon variety and 
quantity of output from the same source’ (p. 213). Divergent thinking would thus be more 
appropriate for comprehending and learning abstract sentences than concrete sentences. 

Some experimental evidence indicates that imaginal processes are more prevalent in convergers 
than divergers. For example, Hudson (1968) rated experimental physicists as highly 
convergent and psychologists as somewhat divergent, and Roe (1953) found that the former 
group reported habitual use of imagery to a significantly greater extent than did the latter group. 
In addition, Hudson (1966) reported that the superiority of convergers to divergers on traditional 
tests of intelligence was particularly great (P < 0.001) on those questions involving diagrams. One 
extreme converger solved mathematical problems wherever possible, ‘not numerically or 
algebraically, but in terms of diagrams, patterns and spatial models’ (p. 115). It should, perhaps, 
be noted that the term imagery is rather amorphous, having been used in the literature to refer to 
spatial and geometrical skills, as well as to the processes involved in daydreaming. 

In sum, convergent thought processes appear to be more appropriate for the extraction of 
meaning from concrete than from abstract sentences, whereas the opposite is true of divergent 
thought processes. Since the evidence indicates that the extraction of meaning is of crucial 
importance to sentence retention (e.g. Bobrow & Bower, 1969; Treisman & Tuxworth, 1974), the 
hypothesis follows that convergers will show the conventional memorial superiority for 
concrete over abstract sentences, whereas divergers will not. 


Method 
Subjects 


The subjects were 40 pupils at school, 19 male and 21 female, all of whom were studying for GCE ‘A’ level. 
Their mean age was 17 years 7 months. 


Materials and procedures 


All subjects received a tape recording consisting of 18 sentences, spoken at the rate of one sentence every 
eight seconds. The sentences were all of the form ‘THE+SUBJECT NOUN+WAS+COMPARATIVE 
ADJECTIVE+THAN+THE+OBJECT NOUN’, and all adjectives used had antonyms. All the sentences used and 
several others were rated for concreteness and comprehensibility in a pilot study by 56 pupils, mean age 17 
years 9 months. Comprehensibility was rated on a four-point scale, running from 1 (‘very easy to 
understand’) to 4 (‘very difficult to understand’). In order to minimize differences in comprehensibility, all 
the selected sentences fell within the range 1.63 to 2.68. Concreteness was rated on a six-point scale, running 
from 1 (‘extremely abstract’) to 6 (‘extremely concrete’). The selected concrete sentences had a mean 
concreteness rating of 4.60, against a mean concreteness rating of 3.09 for abstract sentences (t = 8.73, 
d.f. = 10, P < 0.001). The selected sentences were closely matched for number of syllables, and avoided 
homophony and alliteration. All nouns and adjectives were high-frequency words with 50 or more 
occurrences per million words according to the Thorndike-Lorge (1944) word count. 

The first three and the last three sentences in the presentation order were used as fillers to control for 
primacy and recency effects. The remaining 12 sentences comprised six abstract and six concrete sentences, 
presented in random order. Prior to presentation of the sentences, half the subjects were instructed to use 
imagery, and the other half were given conventional learning instructions. 

A forced-choice recognition test followed immediately after the presentation of the sentences. Each 
forced-choice decision involved selecting one from a tetrad set of sentences constructed as follows: one 
sentence was an input sentence; one was derived from this input sentence by interchanging the subject and 
object nouns; one was identical to the input sentence, except for the replacement of the comparative 
adjective with its antonym; and one involved both replacing the comparative adjective with its antonym and 
interchanging the subject and object nouns. An example of a concrete and of an abstract set of recognition 
test sentences is given below: 


The soldier was thinner than the master 
The master was thinner than the soldier 
The soldier was fatter than the master 
The master was fatter than the soldier 


The moral was longer than the advice 
The advice was longer than the moral 
The moral was shorter than the advice 
The advice was shorter than the moral 
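The construction of a recognition tetrad can be expressed compactly in code. The sketch below is illustrative only: the function name and dictionary keys are hypothetical, and the sentence frame is the comparative form described under Materials and procedures.

```python
def recognition_tetrad(subject, adjective, antonym, obj):
    """Return the four test sentences for one input sentence, keyed by type."""
    def sentence(first, adj, second):
        return f"The {first} was {adj} than the {second}"

    return {
        "input": sentence(subject, adjective, obj),
        "nouns_swapped": sentence(obj, adjective, subject),
        "antonym": sentence(subject, antonym, obj),
        "antonym_and_nouns_swapped": sentence(obj, antonym, subject),
    }


# Reproduces the concrete example set given above.
for kind, text in recognition_tetrad("soldier", "thinner", "fatter", "master").items():
    print(f"{kind}: {text}")
```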


Subjects had eight seconds to indicate with a tick or a dash which sentence was the input sentence. The 
order of testing the sentences was the same as the input order, so that a constant retention interval of two 
minutes six seconds was employed for all sentences. The nature of the subsequent retention test was 
indicated to the subjects prior to the sentence input phase of the experiment. 

In order to measure convergence-divergence, subjects completed Part I of the AH5 Group Test of High 
Grade Intelligence, and the Uses of Objects test in a form based on that used by Hudson (1968). It was felt 
that Part I of the AH5 represented a more adequate measure of convergent-thinking ability than Part II. The 
questions in Part I typically require correct ordering of material as tests of deductive reasoning, whereas Part 
II is more concerned with testing spatial abilities. The common objects used in the Uses of Objects test were 
a light bulb, a car tyre, a pot of jam, a telegraph pole, and a pane of glass. No time limit was imposed. 
Performance on the Uses of Objects test was scored by two independent judges, the criterion being that 
each completely different use for an object was awarded a mark, provided that it was physically possible. 

Scores on both tests were ranked, and there was a non-significant correlation between the sets of rankings 
(Spearman rank-order correlation coefficient = -0.01). Twenty per cent of the subjects having similar 
rankings on both tests were excluded, leaving 32 subjects. Of these, 16 had a higher ranking on the AH5 
Group Test than the Uses of Objects test, and were classified as convergers. The remaining subjects had a 
higher ranking on the Uses of Objects test than the AH5 Group Test, and were classified as divergers. The 
convergers significantly outperformed the divergers on the AH5 Group Test (t = 3.89, d.f. = 30, P < 0.001), 
and the divergers significantly outperformed the convergers on the Uses of Objects test (t = 5.09, d.f. = 30, 
P < 0.001). 
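The classification procedure might be sketched as follows. Several details are assumptions made for illustration: 'similar rankings' is read as the smallest absolute difference between a subject's two ranks, tied scores are ranked in order of appearance, and the ten subjects' scores are invented. The function names are hypothetical.

```python
def rank(scores):
    """Rank scores so that the highest score gets rank 1 (ties keep input order)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0] * len(scores)
    for position, i in enumerate(order, start=1):
        ranks[i] = position
    return ranks


def classify(reasoning_scores, uses_scores, exclude_fraction=0.2):
    """Label each subject 'converger', 'diverger' or 'excluded'."""
    reasoning_rank = rank(reasoning_scores)
    uses_rank = rank(uses_scores)
    # Positive difference: higher standing on the reasoning test than on Uses of Objects.
    diffs = [u - r for r, u in zip(reasoning_rank, uses_rank)]
    n_excluded = round(exclude_fraction * len(diffs))
    by_similarity = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    excluded = set(by_similarity[:n_excluded])
    labels = []
    for i, d in enumerate(diffs):
        if i in excluded or d == 0:
            labels.append("excluded")   # identical ranks cannot be classified
        elif d > 0:
            labels.append("converger")
        else:
            labels.append("diverger")
    return labels


# Invented scores for ten subjects, for illustration only.
reasoning = [30, 28, 27, 25, 24, 22, 20, 18, 17, 15]
uses = [14, 10, 16, 4, 2, 8, 20, 18, 12, 6]
print(classify(reasoning, uses))
```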

In sum, the experimental design comprised three factors (convergence-divergence, instructional set, and 
abstractness-concreteness), with two levels of each factor. 


Results 


The major analysis was performed on the correct recognition data, the means of which are 
presented in Table 1. These data were analysed by means of a 2 (convergence-divergence) by 2 
(instructional set) by 2 (abstractness-concreteness) analysis of variance, with the first two factors 
between subjects and the last factor within subjects. Of the main effects, only that of 
abstractness-concreteness was significant (F = 8.03, d.f. = 1, 28, P < 0.01). The critical 


Table 1. Mean number of abstract and concrete sentences correctly recognized as a function of 
convergence-divergence and instructional set (maximum score = 6). Standard deviations are in 
parentheses 


                    Convergers                      Divergers 
            Imagery set    Control set      Imagery set    Control set 
Abstract    3.38 (0.92)    2.50 (1.60)      3.63 (1.30)    3.38 (1.06) 
Concrete    4.38 (1.19)    4.13 (0.83)      3.63 (2.00)    3.50 (1.51) 
Figure 1. Probability of correct recognition as a function of abstractness-concreteness and 
convergence-divergence. 


interaction between abstractness-concreteness and convergence-divergence was significant 
(F = 6.64, d.f. = 1, 28, P < 0.02), and is shown in Fig. 1. In this interaction, the effect of 
abstractness-concreteness was highly significant for convergers (F = 14.62, d.f. = 1, 28, 
P < 0.001), but there was no effect of abstractness-concreteness for divergers (F < 1). Of the 16 
convergers, 12 correctly recognized more concrete than abstract sentences, with no subjects 
recognizing more abstract than concrete (P < 0.001 on a sign test). Of the 16 divergers, seven 
correctly recognized more concrete than abstract sentences, with five subjects doing the reverse 
(non-significant on a sign test). Four sentences were recognized equally well by convergers and 
divergers, and one sentence in the abstract set was better recognized by convergers than by 
divergers. The remaining seven sentences were better recognized by convergers if concrete and 
by divergers if abstract, as predicted (P = 0.025 on a Wilcoxon matched-pairs signed-ranks test). 
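
The two sign tests reported above can be checked with a few lines of code. This is a minimal sketch that assumes an exact two-tailed binomial sign test with tied subjects dropped; the paper does not state whether a one- or two-tailed test was used, and the function name is hypothetical.

```python
from math import comb


def sign_test_p(n_plus, n_minus):
    """Exact two-tailed sign test; tied cases are dropped before calling."""
    n = n_plus + n_minus
    k = min(n_plus, n_minus)
    # Probability, under a 50:50 split, of a result at least this uneven in one direction.
    one_tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * one_tail)


print(sign_test_p(12, 0))  # convergers: 12 favoured concrete, none favoured abstract
print(sign_test_p(7, 5))   # divergers: 7 favoured concrete, 5 favoured abstract
```

Under these assumptions the convergers' result falls below 0.001 and the divergers' result is far from significance, in line with the values reported.
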
A further analysis was concerned with the types of errors made by the subjects on the 
recognition test. If the input sentences were of the form ‘A > B’, then marking the distractor 
‘B > A’ would be an error that maintained the correct adjective (an A error), marking the 
distractor ‘A < B’ would maintain the noun order (an N.O. error), and marking the distractor 
‘B < A’ would be an error maintaining the meaning of the input sentence (an M error). Paivio 
(1971) has claimed that concrete sentences are stored as images incorporating the meaning of the 
sentences, whereas abstract sentences are stored as strings of words. A reasonable prediction 
from this theory (cf. Begg & Paivio, 1969) is that distractors preserving the meaning of the input 
sentence will be selected relatively more frequently in the case of concrete than of abstract 
sentences. The error data presented in Table 2 indicate that the percentage of meaning-preserving 
errors with concrete sentences (29.2 per cent) was almost identical to the percentage 
of meaning-preserving errors with abstract sentences (28.2 per cent). The error data, based on 
n = 30 in order to manipulate the convergence-divergence factor more rigorously, were analysed 
by means of a 2 (convergence-divergence) by 2 (abstractness-concreteness) by 3 (error type) 
Table 2. Percentages of A, N.O. and M errors to abstract and concrete sentences as a function 
of convergence-divergence 

                    Convergers                Divergers 
                Abstract    Concrete      Abstract    Concrete 
M error         30.4        34.6          25.6        28.2 
N.O. error      37.0        19.2          48.8        25.6 
A error         32.6        46.2          25.6        46.2 

Total           100.0       100.0         100.0       100.0 








analysis of variance, with the first factor between subjects, and the last two factors within 
subjects. The total number of errors was significantly less for concrete than for abstract 
sentences (F = 9.69, d.f. = 1, 28, P < 0.005). The interaction between convergence-divergence 
and abstractness-concreteness was significant (F = 6.73, d.f. = 1, 28, P < 0.02): convergers made 
many more errors on abstract than on concrete sentences, whereas divergers made an equal 
number on both sentence types. The other significant interaction was between 
abstractness-concreteness and error type (P < 0.05). In this interaction, relatively more A errors were made 
with concrete than with abstract sentences, and relatively more N.O. errors were made with 
abstract than with concrete sentences. The last finding may be relevant to Paivio's (1971) 
hypothesis that visual memory images are specialized for parallel or spatial processing, whereas 
verbal memory codes are specialized for sequential or temporal processing. The interaction 
between convergence-divergence and error type was not significant (F < 1). 


Discussion 

The major finding was the highly significant interaction between convergence-divergence and 
abstractness-concreteness, in which convergers showed considerably better recognition of concrete 
than of abstract sentences, whereas divergers performed equivalently on the two sentence 
types. Since the effects of the abstractness-concreteness variable are typically strong and 
reliable, the finding that divergers are unaffected by this variable may prove of importance. The 
favoured interpretation of the data assumes that full comprehension of abstract sentences 
requires the conjoint consideration of more interpretative possibilities than is the case with 
concrete sentences (the interpretative variability hypothesis). Since the ability to produce several 
disparate interpretations for a single nominal stimulus is claimed to be a crucial characteristic of 
divergent thought, but not of convergent thought, the prediction of a significant interaction 
between convergence-divergence and abstractness-concreteness was made. 

Burt (1962) and Hasan & Butcher (1966), among others, have concluded that there is no 
clearcut distinction between convergent and divergent ability, whereas other researchers have 
reached the opposite conclusion (e.g. Hudson, 1966, 1968; Wallach & Kogan, 1965). While the 
evidence is equivocal, Guilford (1971) has suggested that although high IQ is not a sufficient 
condition for high divergent-production ability, it may be almost a necessary condition. Thus, in 
order to sample a wide range of divergent-production ability, it would be necessary to use 
subjects of above-average IQ, as was done in this experiment. Haddon & Lytton (1968) showed 
that a division of subjects into convergers and divergers was less successful in groups of average 
or below-average IQ. 

Begg & Paivio (1969) and Paivio (1971) hypothesized on the dual-coding model that subjects 
should retain the meaning rather than the wording of concrete sentences, and the wording rather 
than the meaning of abstract sentences. However, several negative findings have recently been 
reported in the literature (e.g. Johnson, 1972; Tieman, 1972; James et al. 1973). One of Tieman's 
(1972) experiments is of particular relevance, in that he used simple comparative sentences of 
the type used in this study, and presented forced-choice recognition tests similarly constructed to 
those in this experiment. He found no effect of abstractness-concreteness on error type, and 
concluded that his data were at variance with the predictions of Paivio (1971). However, in his 
study there was a highly significant difference in the distribution of the adjectives by which the 
concreteness-abstractness variable was governed into high-frequency (50 per million and above) 
and low-frequency (49 per million and below) words (χ² = 10.97, d.f. = 1, P < 0.001). Word 
frequency was carefully controlled in the present study, and a significant 
abstractness-concreteness × error-type interaction was obtained. Paivio's (1971) hypothesis can 
most appropriately be examined in terms of the following formula (Tieman, 1972): 


ET (error term) = No. of M errors − (No. of N.O. errors + No. of A errors)/2. 

If the error term is positive, this indicates that recognition errors are retaining the meaning rather 
than the wording of the input sentences; if the error term is negative, then errors are retaining 
the wording rather than the meaning of the input sentences. The error term was slightly negative 
for both concrete and abstract sentences, and was not differentially affected by abstract and 
concrete sentences in the way Paivio (1971) would predict. However, the tendency for N.O. 
errors to be relatively more frequent with abstract than with concrete sentences is consistent 
with the hypothesis (Paivio, 1971) that the verbal processes used with abstract material are 
specialized for sequential or temporal processing. 
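
The error classification and the error term can be illustrated with a short sketch. Two things here are assumptions rather than the authors' stated method: the error term is taken to contrast M errors with the mean of the N.O. and A errors (a form that gives zero when the three distractors are chosen equally often), and it is applied to the percentages in Table 2 rather than to raw error counts. The function names are hypothetical.

```python
def error_type(nouns_swapped, adjective_reversed):
    """Name the error made when a distractor is chosen for an input 'A > B'."""
    if nouns_swapped and adjective_reversed:
        return "M"       # 'B < A': meaning kept, wording changed
    if nouns_swapped:
        return "A"       # 'B > A': correct adjective kept
    if adjective_reversed:
        return "N.O."    # 'A < B': noun order kept
    return None          # the input sentence itself, so no error


def error_term(m, no, a):
    """ASSUMED form of the error term: M errors minus the mean of N.O. and A errors."""
    return m - (no + a) / 2


# Percentages from Table 2, in the order (M, N.O., A).
table_2 = {
    ("convergers", "abstract"): (30.4, 37.0, 32.6),
    ("convergers", "concrete"): (34.6, 19.2, 46.2),
    ("divergers", "abstract"): (25.6, 48.8, 25.6),
    ("divergers", "concrete"): (28.2, 25.6, 46.2),
}
for cell, (m, no, a) in table_2.items():
    print(cell, round(error_term(m, no, a), 1))
```

On this reading, a positive value indicates errors that tend to preserve meaning and a negative value indicates errors that tend to preserve wording.
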

The failure of imagery instructions to enhance retention-test performance contrasts with 
several earlier findings (e.g. Bower, 1972). A possible explanation for this finding stems from the 
work of Anderson & Kulhavy (1972). They instructed half their subjects to use imagery while 
learning a prose passage, while the control subjects were given conventional learning 
instructions. Retention was unaffected by instructions. However, a post-experimental 
questionnaire indicated that more than half of the control subjects employed imagery while 
studying the passage, and about one-third of the subjects receiving imagery instructions reported 
not using imagery or not doing so consistently. 

It should be apparent that this study has raised more questions than it has answered. The 
authors believe that there are important interrelationships among convergence-divergence, 
imagery ability, verbal ability, and memory, but it is probable that further advances will require 
increased conceptual precision. At the experimental level, further work in this area might with 
advantage investigate speed of comprehension of the two (or more) alternative interpretations 
contained in ambiguous sentences. It follows from the hypothesis developed here that divergers 
should extract the second (and subsequent) interpretations of ambiguous sentences more rapidly 
than convergers. 
Pattern detection by mongol and non-mongol subnormals 


G. McDonald and D. N. Mackay 





Ten mongols and ten clinically heterogeneous subnormals matched on chronological age, mental age and digit 
span took part in an experiment in which tape-recorded supra-span digit sequences with different patterns 
were presented. There were six patterns: random, mirror (e.g. 583385), same-digit pairs (e.g. 558833), 
same-digit throughout (e.g. 333333), couplet repetition (e.g. 585858) and triplet repetition (e.g. 583583). The 
numbers of digits correctly recalled in any order by the mongols in the various conditions ranked from least 
to most were: random, mirror, same-digit pairs, same-digit messages, triplet repetition and couplet repetition. 
The rank order for the non-mongols was the same except that the positions of couplet and triplet repetition 
were reversed. Mongols had significantly poorer recall for random, mirror and same-digit-pair messages than 
non-mongols but were their equals in other conditions. The mongols' performance was more sensitive to 
pattern than the performance of the other subjects. There was some evidence that, in the messages with 
same-digit pairs and the same digit throughout, all subjects (but mongols in particular) tended to insert new 
digits into the response sequence and that the digit introduced was the next one in simple arithmetic 
progression. It would appear that the hypothesis about poor auditory-vocal channelling capacities of mongols 
needs qualification. 





The concept of subjective organization in a learning task now covers so many different 
operations that any definition of it must necessarily be loose. Tulving (1968) suggested that, in 
tasks involving free recall, clustering by categories be distinguished from subjective organization. 
The former occurs when items from the same class are recalled together. The latter occurs when 
subjects impose their own organization upon apparently unrelated items. This distinction is 
useful but only when applied to certain types of learning. Recent work with children suffering 
from different handicaps has pointed up other types of organization. In some of their experi- 
ments O'Connor & Hermelin (1974) were concerned to know how blind, autistic and subnormal 
children coded three-digit sequences when temporal presentation differed from spatial presenta- 
tion. Frith (1970) read out binary word sequences to autistic, subnormal and normal children. 
These sequences had certain patterns. One, for example, had the following form: ‘spoon horse 
spoon horse horse spoon'. She found that while normal and subnormal children tended to recall 
the items according to the presentation pattern, autistic children not infrequently imposed their 
own order on the material. 

Another experiment which had less rigid stimulus patterns is reported by Spitz, Goettler & 
Webreck (1972). They were interested in finding out whether subnormals (mean IQ: 62) could 
detect, and make use of, patterns in visually presented digit sequences. In this material there 
was ‘repetition redundancy’ (e.g. 528528) or ‘couplet redundancy’ (e.g. 552288). Recall scores 
showed that the subnormals found it relatively easy to perceive and make use of couplet 
redundancy in the learning task. However, they found it difficult to recognize the presence of 
repetition redundancy and the experimenters had to introduce cues of spacing and underlining. 
The method used to display the stimuli merits annotation. A given digit message was presented 
on a card. Subjects had time to scan the sequence, to check on order, to verify that there was a 
pattern and even to rehearse. In this context subjective organization includes feature extraction 
and the formation of learning strategies. 

We were interested in finding out whether mongols and non-mongol subnormals could detect 
and make use of patterns in orally presented digit sequences. The sequences 585858 and 583583 
could be construed as repetitions of two- and three-digit strings respectively. If so interpreted, 
subjects could theoretically reproduce only one component, i.e. a couplet or triplet. Furthermore,
with this type of presentation, subjects would not be able to check at will on the structure
of the message. They had been able to do this in the study by Spitz et al. (1972) where 
presentation was visual. 

The main purpose of the study, however, was to compare the performance of mongols and 
non-mongols. In his comprehensive review of the literature Belmont (1971) concludes that 
mongols are inferior to other subnormals on auditory-vocal tasks. We have found that they are 
particularly vulnerable to proactive and retroactive interference (McDonald & MacKay, 1974a, 
b). More recently we (MacKay & McDonald, 1976) found that they can detect and make use of 
patterns in digit sequences. Unfortunately, the sequences used in this last experiment were of 
the same length as the ones on which matching had been based and the full effects of pattern 
detection were not observed, particularly in the case of non-mongols. In the present experiment 
we used supra-span messages and also introduced more patterns into the stimuli. 

The amount correctly recalled irrespective of order was of much more interest to us than 
correct recall in the right order because we had found that both groups of subjects reproduced 
the sequences according to presentation patterns (MacKay & McDonald, 1976). Nevertheless we 
did check on order recall simply to see whether there were differences between groups with 
supra-span messages. 

When this and other experiments were first designed we had intended to use subnormal 
children as subjects. Not surprisingly, we could not get enough mongols whose MA and digit 
span were comparable to those of non-mongols. We therefore had to employ adult subjects. 


Method 
Subjects 


Ten mongol and ten non-mongol subnormals matched on CA, MA and digit span took part. The non-mongol 
subjects comprised a clinically heterogeneous group: four were epileptic, three were familial and three were 
undifferentiated. Details of age and ability are summarized in Table 1. Matching on a digit-span of four had 
been strict: the criterion for inclusion was that the subject be able to reproduce in any order a tape-recorded 
random four-digit sequence in three successive trials. Successful candidates were then given a five-digit 
sequence. If they reproduced the digits correctly in any order they were dropped from the investigation. 
Originally, 17 mongols with MAs of between 5 years 7 months and 7 years 2 months were tested for digit 
span. Five were rejected because they failed to reach criterion and one because he exceeded criterion. 
Another was omitted simply in order to make the sample sizes equal. 

All subjects come from the southern part of the province. They live at home and attend special centres on 
a daily basis. None had ever taken part in this sort of experiment before. 


Table 1. Mean chronological and mental ages of the subjects


                     Mongols    Non-mongols
CA
    Mean              24.62        25.80
    S.D.               2.47         1.75
MA
    Mean               6.01         6.05
    S.D.               0.40         0.30

Digit span             4            4


Material and procedure


The material comprised tape-recorded six-digit messages presented at the rate of one per second. Six 
structures were used: 


Type Example 
Random 385169 
Couplet repetition 585858 
Triplet repetition 583583 
Mirror 583385 
Same-digit pairs 558833 
Same-digit message 555555 


In couplet and triplet repetition obvious arithmetical relationships were avoided (e.g. 565656). In mirror 
messages the last three digits were the same as the first three, but in reverse order. 
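
The five patterned structures can be illustrated with a short sketch. It assumes, purely for illustration, that each patterned message is built from a single three-digit seed; the function name `make_messages` and the seed-based construction are hypothetical and are not described in the procedure. The check that obvious arithmetical relationships are avoided is not implemented here.

```python
def make_messages(seed: str) -> dict[str, str]:
    """Build the five patterned six-digit messages from a three-digit seed.

    Hypothetical helper: the paper gives only example messages, not a
    generation rule, so this construction is an assumption.
    """
    if len(seed) != 3 or not seed.isdigit():
        raise ValueError("seed must be a string of three digits")
    a, b, c = seed
    return {
        # First two digits repeated three times, e.g. 585858
        "couplet repetition": (a + b) * 3,
        # All three digits repeated twice, e.g. 583583
        "triplet repetition": seed * 2,
        # Last three digits are the first three in reverse order, e.g. 583385
        "mirror": seed + seed[::-1],
        # Each digit doubled in turn, e.g. 558833
        "same-digit pairs": a * 2 + b * 2 + c * 2,
        # One digit throughout, e.g. 555555
        "same-digit message": a * 6,
    }


messages = make_messages("583")
```

With the seed 583 this reproduces the example messages listed in the table of structures above.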

There were ten trials under each condition and all subjects were tested on all conditions. In the same-digit 
messages each digit 1-9 appeared once for each subject and one digit appeared twice. The order in which the 
conditions were presented was random. 

Each subject was brought to the clinical room of the special centre and approximately 15 min were spent 
in re-introducing him to the task and allowing him to settle down (subjects were familiar with the ‘numbers 
game’, as it was described, because a tape-recorder had been used in the matching procedure). Five pre-test 
trials were administered under each of the conditions at the beginning of each session. The subject was given 
20 sec in which to recall a sequence. Another 30 sec elapsed before the next message was presented. During 
this period the subject was engaged in general conversation which, in effect, meant talking about his family, 
his friends at the centre and recent events in his own personal life. Seven seconds before the presentation of 
the next sequence he was told ‘Now listen carefully and tell me what numbers you hear’. No more than 20 
test trials were given in any one session and a period of between 24 and 72 hours separated sessions. 


Results 


Each response sequence was recorded. The number of digits reproduced by mongols for each
message (irrespective of whether they were right or wrong) varied between three and eight. The
number depended on the pattern of the stimulus message: for random and mirror sequences the
mean was 5.1; for couplet and triplet repetition it was 6.0; and for the same-digit pairs and
same digits throughout it was 6.7. By contrast, the mean number of digits in the response
messages of the non-mongols remained stable at six for all types of structure.

To take into account the correct recall of digits on chance grounds alone, raw item scores (the 
number of digits correctly recalled in any order) were adjusted by means of the Bousfield & 
Bousfield (1966) formulae. To reduce marked heterogeneity of variance, square root transformations
of the adjusted scores were carried out. Cochran’s test of homogeneity (Winer, 1962)
revealed that the variances of the transformed scores were well within acceptable limits.

The mean item scores of the mongols and non-mongols are shown in the upper parts of Tables 
2 and 3 respectively. These means are given in order of increasing size and it should be noted 
that there is a slight difference between the two tables in the order of the conditions. A 2 × 6
analysis of variance with repeated measures showed that the main effects were significant
(subject groups, F = 19.19, d.f. = 1, 18, P < 0.01; message structure, F = 26.19, d.f. = 5, 90) and
that the interaction was also significant (F = 4.50, d.f. = 5, 90, P < 0.01). Tukey (a) tests (0.01
level) were used in the more detailed analysis. The significant differences between types of
message structure are marked with an asterisk in the lower parts of Tables 2 and 3.
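
The lower parts of Tables 2 and 3 are simply the pairwise differences between the ordered condition means. A minimal sketch of that arithmetic follows, using the mongols' means as read from Table 2. The names `MONGOL_MEANS` and `pairwise_differences` are hypothetical. The significance flags are not reproduced, because the Tukey (a) test needs the error term from the analysis of variance, which is not reported.

```python
from itertools import combinations

# Mean item scores of the mongols, as read from Table 2 (transformed, adjusted scores).
MONGOL_MEANS = {
    "random": 8.27,
    "mirror": 10.24,
    "same-digit pairs": 10.65,
    "same-digit message": 12.09,
    "triplet repetition": 12.49,
    "couplet repetition": 13.22,
}


def pairwise_differences(means: dict[str, float]) -> dict[tuple[str, str], float]:
    """Return the difference (larger minus smaller) for every pair of conditions."""
    ordered = sorted(means.items(), key=lambda item: item[1])
    return {
        (low_name, high_name): round(high_value - low_value, 2)
        for (low_name, low_value), (high_name, high_value) in combinations(ordered, 2)
    }


differences = pairwise_differences(MONGOL_MEANS)
```

Six conditions give 15 pairs, which is the "possible 15 differences" referred to in the Discussion.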

The mongols’ performance was significantly inferior to that of non-mongols when the messages
were random, or had either a mirror pattern or same-digit pairs.

To take into account the correct recall of items on chance grounds alone, order scores (the 
number of digits correctly recalled in the right order) were adjusted along the lines suggested by 
Fagan & Lowden (1967). Square root transformations were used in order to reduce heterogeneity 
of variance below the limits set by Cochran’s test of homogeneity. A 2 × 5 analysis of variance

Table 2. Mean item scores of mongols

                                             Message pattern
                      Random   Mirror   Same-digit   Same-digit   Triplet      Couplet
                                        pairs        message      repetition   repetition
Mean                    8.27    10.24     10.65        12.09        12.49        13.22
S.D.                    1.44     1.63      1.73         0.92         1.55         1.10

Differences between means
Random                           1.97*     2.38*        3.82*        4.22*        4.95*
Mirror                                     0.41         1.85*        2.25*        2.98*
Same-digit pairs                                        1.44*        1.84*        2.57*
Same-digit message                                                   0.40         1.13
Triplet repetition                                                                0.73

* Significant difference (Tukey (a) test, 0.01 level).


Table 3. Mean item scores of non-mongols

                                             Message pattern
                      Random   Mirror   Same-digit   Same-digit   Couplet      Triplet
                                        pairs        message      repetition   repetition
Mean                   11.34    12.24     12.52        12.87        13.39        13.57
S.D.                    1.48     1.30      1.42         0.75         1.11         0.85

Differences between means
Random                           0.90      1.18*        1.53*        2.05*        2.23*
Mirror                                     0.28         0.63         1.15         1.33*
Same-digit pairs                                        0.35         0.87         1.05
Same-digit message                                                   0.52         0.70
Couplet repetition                                                                0.18

* Significant difference (Tukey (a) test, 0.01 level).


(for obvious reasons the condition involving same-digit messages is omitted) revealed significant
main effects (subject groups, F = 4.81, d.f. = 1, 18, P < 0.05; message structure, F = 152.82,
d.f. = 4, 72). The interaction was also significant. Tukey (a) tests (0.01 level) showed that the
mongols’ performance was significantly poorer than that of the non-mongols for sequences which
were random, or had a mirror pattern or same-digit pairs.


Discussion 


This brief discussion will take as its starting point the similarities in the performances of the two
groups. First, the rank orders of item scores across conditions were, with one minor exception,
the same for both groups. That is to say, they found random messages the most difficult and
couplet or triplet repetition the easiest to recall. Second, the patterns in the response messages in
most cases closely reflected the patterns in the stimulus sequences. Third, the mongols were the
equal of non-mongols when messages had the same digit throughout, or had triplet and couplet
repetition. Clearly, Belmont's (1971) hypothesis about a deficit in the auditory-vocal channelling
capacities of mongols appears to need some sort of qualification.

There is the perhaps curious fact that the stimulus message with the most redundant 
information (aaaaaa) was not the easiest to recall. We examined the original responses and found
that there was a marked tendency for both groups of subjects, but the mongols in particular, to
introduce a new digit and that this intruding item was usually the next in simple arithmetic 
progression. For example, in response to the message 333333 subjects tended to reply as follows: 
333444 or 334455. This intrusion of arithmetic progression also occurred in sequences with 
same-digit pairs. For example, in response to the string 228855 there was a tendency to reply as 
follows: 223344 or 22889. 

Of more interest than the similarities between groups are the differences between them. When 
the messages had no pattern (random) or had mirror structure the mongols were significantly 
poorer in item scores than the non-mongols. Mongols were also inferior on same-digit pairs: 
examination of original response sequences showed that they tended to introduce new digits 
according to arithmetic progression more often than non-mongols. (We did not check this 
statistically.) It would seem, therefore, that when a pattern is not obvious mongols are able to 
process supra-span messages less efficiently than other subnormals even though they were 
originally matched on this variable. As a matter of fact, the non-mongols were correctly 
reproducing an average of four digits in any order in response to random messages whereas the 
mongols were reproducing an average of between two and three, allowing for adjustment. This 
means that supra-span messages without patterns affect the two groups in different ways. We are 
at present investigating the possibility that the last two items in such messages have a marked 
retroactive interference effect in the case of mongols. 

The mongols’ performance in most conditions seemed to be more sensitive to message 
structure than the other subjects’. Eleven of the possible 15 differences were significant in item 
recall compared with only five in the case of the non-mongols. For example, the mongols' scores 
for mirror messages were significantly higher than their scores for random messages. In the 
mirror sequence the first and last items, and the third and fourth items, are identical. Given the 
relatively poor input processing of unpatterned material by mongols this sort of difference is not 
surprising. By the same token, the apparent imperviousness of non-mongols to pattern in many 
conditions may simply be due to the fact that their superior processing skills made pattern a less 
important factor in learning. 

Finally, we should like to anticipate the criticism that we did not have a representative sample 
of mongols. We are aware that we selected subjects who had a relatively good digit span but 
would point out that since they did not in some instances perform as well as the non-mongols, it
is fairly evident that the deficits we observed would be even more marked in most mongols.


An experimental investigation of Piaget’s analysis of class inclusion


Sara Meadows 


Piaget’s analysis of the class inclusion problem identifies three operations in its solution. His prediction of 
the order of achievement of these constituent operations was tested in two experiments. The results of both 
experiments were contrary to his hypothesis and lend no support to the importance of the suggested 
operations. Explanations for the order of achievement of constituents of class inclusion are discussed, and 
an alternative model of the class inclusion problem is put forward.


Piaget in his main analysis of classification behaviour (Piaget & Inhelder, 1964) considers the 
quantification of class inclusion. The critical problem is the comparison of a class B with its 
complementary subclasses A and A’. He notes that young children err by answering as if the 
question had required a comparison of A and A’; thus in a collection of flowers (B) of which 
more than half are primroses (A) they reply, when asked whether there are more flowers or more 
primroses, that there are more primroses. An experiment by Ahr & Youniss (1970) confirms this. 
Piaget believes that to solve the class inclusion problem the child must coordinate two operations,
components of grouping I; once he knows that A + A’ = B and A = B − A’ he can deduce
that B > A. He supposes that the first two operations develop together in a mutually facilitating 
fashion, each being the logical inverse of the other; only when they are firmly established in the 
child’s repertory can the class inclusion question, are there more B or more A, be answered 
correctly. 

If this operational analysis, which Piaget has repeated in several works with unusual clearness, 
is correct, then there should be a clear order of development of the three operations. The 
concepts of A + A’ = B and A = B − A’ should develop in close parallel, and that of B > A should
lag behind. Few studies have tested this prediction. Kofsky (1966) included all three problems in
a larger study of classificatory development, and her versions of the tests were used in a
longitudinal study by Calvert (1972). Kofsky found that success on the A + A’ = B comparison
was achieved about a year before the A = B − A’ comparison, and B > A was the hardest of the
three, but this order was by no means universal. Calvert found the same order (B = A + A’,
A = B − A’, B > A) on her third administration; on the others the order was B = A + A’, B > A,
A = B − A’. Again there were numerous inconsistencies and reversals, particularly on B = A + A’.

It may be doubted whether the three tasks used in these studies were really equal in situational 
complexity. For both the B = A + A’ and the A = B − A’ comparisons the array consisted of eight
objects (B) of which six were A and two A’; in Kofsky's case B were squares, A blue squares
and A’ red squares. The B = A + A’ questioning was straightforward, the question in both studies
being of the form ‘If you put the reds (A) and the blues (A’) together would there be more of
them, or more squares, or as many reds and blues as squares?’, and the required answer being
‘as many reds and blues as squares’. For A = B − A’ the question Kofsky used was ‘If I took
away all the reds, are there just blues left, just squares left, or both blues and squares?’ and the 
correct answer was ‘both blues and squares’; Calvert asked a simpler question ‘If I took away 
all the reds, what would be left, the reds or the blues or the squares?’ but required the same 
answer, ‘blue squares’. In neither case was ‘blue ones’, although an unambiguous description of 
what would be left after the hypothesized removal and a logically sufficient answer to at least 
Calvert’s form of the question, counted as correct. This may have contributed to the greater
difficulty of the question. In both studies the array for the B > A comparison consisted of nine
objects, of which four were blue squares, two blue triangles and three red triangles. This allowed
for two class inclusion questions [‘Are there more blues (B) or more squares (A)?’ and ‘Are
there more triangles (B) or more reds (A)?’] and a cross-class question (‘Are there more
triangles or more blues?’) but it also complicated the array.

In view of the conflicting results and the methodological faults of these previous experiments, 
the class inclusion experiment was repeated as part of a longitudinal study of the development of 
concrete operations (Meadows, 1975) asking children the three questions about the same array. 
The predicted order of achievement of the three operations was, following Piaget, A = B − A’ and
B = A + A’ together, then B > A.


Experiment I 
Method 


A version of the class inclusion test was included in a battery of nine Piagetian tests used in a longitudinal 
study of the development of concrete operations (Meadows, 1975). The subjects were 120 children (60 boys, 
60 girls) aged between 5.0 and 11.10 from four primary schools in an Inner London borough; they were all 
English speaking and of average or above-average IQ. Testing was carried out individually away from the 
classroom and was repeated twice at six-month intervals. In each of the three testing sessions the child was 
shown a card on which were drawn eight coloured spots (B), five of them pink (A) and three blue (A’). The 
child was first required to point out ‘the pink ones’, ‘the blue ones’ and ‘the round ones’. Then the 
following questions were asked in random order: 

‘Are there more round ones or more pink and blue ones together?’ (B = A+A’), to which the required 
answer was ‘the same’; 

‘If I took away the blue ones, what would be left, the pink ones or the blue ones or the round ones?’ 
(A = B − A’), to which the required answer was ‘the pink ones’ or ‘the pink round ones’;

‘Are there more round ones or more pink ones?’ (B > A), to which the required answer was ‘more round 
ones’. 

Questions answered unclearly were repeated later, but no feedback was given as to the correctness of the 
answers. 


Results 


Performance improved over the three testings, partly because of the increasing age and maturity 
of the subjects and partly because of practice effects (F = 21.4, ν1 = 1, ν2 = 114, P < 0.001) but
the pattern of responses, which is shown in Table 1, was stable over administrations. 


Table 1. Distributions of answers on three questions relating to class inclusion operations
(n = 120)

                                       No. cases observed
          Results on                  Occasion of testing    Total over       No. cases expected
A = B − A’   B = A + A’   B > A         1      2      3      three testings   (Piaget’s prediction)
    −            −          −           2      1      0            3          -
    +            −          −          48     22      7           77          A few or none
    −            +          −           0      0      0            0          A few or none
    −            −          +           4      2      1            7          None
    +            −          +          35     51     42          128          None
    +            +          −           2      2      2            6          A lot
    −            +          +           0      0      0            0          None
    +            +          +          29     42     68          139

It may be seen that the A = B − A’ comparison was very easy indeed, the question being
answered correctly a total of 350 times out of 360, and being apparently the first question to be
answered correctly for most children. A + A’ = B was very hard; the question was answered
correctly only 145 times out of 360, it was answered incorrectly 128 times when both the other
questions were answered correctly, and it was never the only question correctly answered.
B > A was of intermediate difficulty, rarely the first question correctly answered but also rarely
the last.
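
The totals quoted in this paragraph can be recovered from the "total over three testings" column of Table 1 by summing the response patterns in which each question was answered correctly. The sketch below is only a check on that arithmetic. The names `PATTERN_TOTALS` and `correct_counts` are hypothetical, and the entry of 77 for the second pattern is taken as the sum of its three occasion counts (48 + 22 + 7).

```python
# Key: (A = B - A' correct, B = A + A' correct, B > A correct)
# Value: number of cases summed over the three testings (Table 1).
PATTERN_TOTALS = {
    (False, False, False): 3,
    (True, False, False): 77,   # 48 + 22 + 7 over the three occasions
    (False, True, False): 0,
    (False, False, True): 7,
    (True, False, True): 128,
    (True, True, False): 6,
    (False, True, True): 0,
    (True, True, True): 139,
}


def correct_counts(totals: dict[tuple[bool, bool, bool], int]) -> tuple[int, int, int]:
    """Count how often each of the three questions was answered correctly."""
    return tuple(
        sum(count for pattern, count in totals.items() if pattern[question])
        for question in range(3)
    )


subtraction_correct, addition_correct, inclusion_correct = correct_counts(PATTERN_TOTALS)
total_cases = sum(PATTERN_TOTALS.values())
```

The first two counts agree with the 350 and 145 out of 360 given in the text; the third is the corresponding count for the B > A question.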

The relative difficulties of the three problems are clearly different from what Piaget predicted 
they would be. Statistical assessment of this difference is complicated by the many zero 
frequencies of response types which Piaget's model predicts. Because of them it is not possible
to use a simple one-sample χ² test on the data. Instead the weaker hypothesis that children's
responses would be evenly distributed over the different response patterns was tested using a
one-sample χ² test. The numbers in the cells which Piaget’s model predicted would have zero
frequencies were also subjected to a one-sample χ² test, with expected frequencies for each cell
equal to one-third of the total cases. In both cases all the χ²s were significant at or below the
0.001 level. It was, therefore, concluded that the obtained distributions were radically different
from that predicted by Piaget.
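
As an illustration of the first of these tests, the sketch below computes a one-sample χ² against an even distribution over the eight response patterns, separately for each occasion of testing, using the counts in Table 1. It is a reconstruction under stated assumptions and not the author's own computation: the per-occasion analysis, the names used and the use of 7 degrees of freedom (eight patterns minus one) are assumptions, and 24.32 is the standard tabled critical value of χ² for 7 degrees of freedom at the 0.001 level.

```python
# Counts per response pattern on each occasion of testing (rows of Table 1, n = 120).
OCCASION_COUNTS = {
    1: [2, 48, 0, 4, 35, 2, 0, 29],
    2: [1, 22, 0, 2, 51, 2, 0, 42],
    3: [0, 7, 0, 1, 42, 2, 0, 68],
}

# Tabled critical value of chi-square, 7 degrees of freedom, 0.001 level.
CRITICAL_CHI2_DF7_P001 = 24.32


def chi_square_uniform(observed: list[int]) -> float:
    """One-sample chi-square against equal expected frequencies in every cell."""
    expected = sum(observed) / len(observed)
    return sum((count - expected) ** 2 / expected for count in observed)


chi2_by_occasion = {
    occasion: chi_square_uniform(counts)
    for occasion, counts in OCCASION_COUNTS.items()
}
all_significant = all(
    value > CRITICAL_CHI2_DF7_P001 for value in chi2_by_occasion.values()
)
```

Under these assumptions every occasion departs from an even distribution by a wide margin, which is in line with the conclusion drawn in the text.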

This experiment, therefore, suggested that Piaget's order of achievement of operations was 
not correct; to make sure that the obtained order held for a variety of arrays a replication 
experiment was carried out. 


Experiment II 
Method 


Pictures of eight arrays (two examples follow as Fig. 1) differing in content [abstract shapes (crosses) or
children], number of stimuli (B = 8, B = 21), and presence or absence of stimuli not members of B (the
‘distractor set’; these stimuli were two circles in the abstract arrays and two adult females in the children
arrays), and accompanied by the three appropriate questions worded in the same form as in the first
experiment, were administered as a multiple-choice group test. The subjects were 50 children aged from 7.9
to 9.3 with a mean of 8.6 from the same area of North London as the subjects of Expt. I. The order of the
questions accompanying the arrays, of B, A and A’ in the questions, and of the possible answers were
randomized between arrays; the order of arrays was randomized between groups of subjects. Thus each 
subject saw the same arrays but in one of four different random orders. The children were told that they 
would be shown pictures and would have to answer questions about them by underlining the correct answer; 
the booklet of arrays was gone through page by page, the tester reading the questions clearly and with even 
emphasis, and allowing a few minutes after each for the child to make his answer. The following 
experimental hypotheses were considered: 


H1 The order of difficulty of questions on all arrays will be the same as in the first experiment,
A = B − A’, B > A, B = A + A’.

H2 Arrays containing a large number of items will be harder than arrays containing a small number of
items.

H3 The presence of a distractor set will decrease the difficulty of the B = A + A’ question and increase
the difficulty of the B > A question.

H4 Arrays containing familiar objects, e.g. children, will be easier than arrays containing unfamiliar
objects, e.g. crosses.


Differences between the eight arrays were assessed using Cochran's Q test (Siegel, 1956, pp. 161-166). 
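
Cochran's Q compares the proportions of correct answers across several related conditions, here the eight arrays answered by the same children. A minimal sketch of the standard formula follows. The per-child answers are not reported, so the small data set below is entirely hypothetical and serves only to show the computation; the function name `cochran_q` is likewise an assumption.

```python
def cochran_q(rows: list[list[int]]) -> float:
    """Cochran's Q for a subjects-by-conditions table of 0/1 scores.

    Q = (k - 1) * (k * sum(C_j^2) - N^2) / (k * N - sum(R_i^2)),
    where C_j are column totals, R_i are row totals, N is the grand total
    and k is the number of conditions. Q is referred to chi-square with
    k - 1 degrees of freedom.
    """
    k = len(rows[0])
    if any(len(row) != k for row in rows):
        raise ValueError("every subject needs a score for every condition")
    column_totals = [sum(row[j] for row in rows) for j in range(k)]
    row_totals = [sum(row) for row in rows]
    grand_total = sum(row_totals)
    denominator = k * grand_total - sum(total ** 2 for total in row_totals)
    if denominator == 0:
        raise ValueError("Q is undefined when no subject varies across conditions")
    numerator = (k - 1) * (k * sum(c ** 2 for c in column_totals) - grand_total ** 2)
    return numerator / denominator


# Hypothetical scores: five subjects (rows) on three conditions (columns).
example_scores = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
    [1, 1, 0],
]
q_value = cochran_q(example_scores)
```

With eight arrays the statistic has 8 − 1 = 7 degrees of freedom, the value quoted alongside each Q in the Results.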


Results 


H1, concerning the order of difficulty of the three questions, is supported in the case of all eight
arrays. A = B − A’ was equally easy for all arrays (Q = 6.31, ν = 7, P > 0.4) and so markedly
easier than the other comparisons as to suggest it requires simpler skills. It will not be
considered further in this section. B > A was intermediate in difficulty for all arrays, but there
were significant differences in its difficulty between arrays (Q = 46.4, ν = 7, P < 0.001).

Table 2. Number of subjects answering constituent questions on class inclusion replication
arrays correctly (n = 50)

                                          Crosses                              Children
                            A = B − A’   B > A   B = A + A’      A = B − A’   B > A   B = A + A’
Small number,
  distractor present            47         23        14              50         38        10
Small number,
  distractor absent             47         19        13              46         30         5
Large number,
  distractor present            47         10         9              49         43         9
Large number,
  distractor absent             48         19         8              50         41        13


Figure 1(a). Small number, distractor present, children array 
LOOK AT THIS PICTURE 





ARE THERE MORE BOYS AND GIRLS TOGETHER OR MORE CHILDREN? 
THE SAME 
MORE CHILDREN 
MORE GIRLS AND BOYS 
ARE THERE MORE CHILDREN OR MORE BOYS? 
MORE BOYS 
THE SAME 
MORE CHILDREN 
IF THE BOYS WENT AWAY, WHAT WOULD BE LEFT, THE CHILDREN OR THE 
BOYS OR THE GIRLS? 
THE GIRLS 
THE BOYS 
THE CHILDREN

Figure 1(b). Large number, distractor absent, crosses array
LOOK AT THIS PICTURE 





ARE THERE MORE BIG ONES AND LITTLE ONES TOGETHER OR MORE CROSSES? 
MORE CROSSES 
MORE BIG ONES AND LITTLE ONES 
THE SAME 


IF I TOOK AWAY THE BIG ONES, WHAT WOULD BE LEFT, THE LITTLE ONES OR 
THE BIG ONES OR THE CROSSES?
THE LITTLE ONES 
THE CROSSES 
THE BIG ONES 


ARE THERE MORE CROSSES OR MORE BIG ONES? 
MORE CROSSES 
THE SAME 
MORE BIG ONES 


B = A + A’ was the hardest comparison for all arrays, but again there were significant differences
between arrays (Q = 18.8, ν = 7, P < 0.02). These differences are examined in the discussion of
the subsidiary hypotheses, H2, H3 and H4.

H2, that large arrays would be harder than small ones, is supported in the case of the arrays of
crosses but not in that of the arrays of children. However, the large arrays of children were not
pictured but verbally suggested (‘Think of all the children in your class’); Wohlwill (1968) and
Inhelder, Sinclair & Bovet (1974) found that verbally presented problems were easier than 
pictorially presented ones, and this effect may have obscured the small differences due to the 
larger size of the array. 

H3, concerning the effect of a distractor set, was not supported in either of its parts. The
presence of a distractor set made little or no difference to the difficulty of the B = A + A’
comparison, and it made the B > A comparison easier rather than harder on both small number
arrays and on the large number (children) array; comparison on the large number (crosses) arrays
is confused by the uniquely low success rate for B > A on the large number, distractor present
array of crosses. It would appear, however, that the ‘distractor set’ did not distract the subjects
from the correct answer; if anything, it helped them by providing a contrast to B which
emphasized its existence as a class.

H4, that arrays containing familiar objects such as children would be easier than arrays
containing unfamiliar geometric shapes, was also only partially confirmed. The B > A (children)
question was much easier than the B > A (crosses) question, but the B = A + A’ (crosses)
question was easier three out of four times than the B = A + A’ (children) question, which was
the problem that evoked most overt puzzlement amongst the subjects. One reason may be that
children are very familiar with the fact that boys and girls are also children; they, therefore, find 
it easy to say that there are more children than boys, rarely making the erroneous comparison of 
boys with girls. Their idea of this identity is so strong however, that the question ‘are there 
more children or more boys and girls together?’ is registered as completely tautologous and thus 
non-sensical, and the question is rephrased as the more sensible ‘more boys or more girls?’ or 
‘more children or more boys?’. Many errors on this question in this experiment consisted of an 
underlining of one noun in the answer, thus ‘more boys and girls together’. 

Discussion and conclusions

These two experiments reveal an order of achievement of operations radically different from that
suggested by Piagetian theory, which is also unable to cope with the apparently systematic 
differences between arrays found in the second experiment. It would appear that class inclusion 
problems may be solved by processes other than those which Piaget has described; an alternative 
explanation of the phenomena is necessary. Such an explanation may more easily be reached if 
the three constituents of class inclusion are considered separately. 

Very young children find it easy to answer correctly questions of the form ‘if I take away the 
A’ ones, what is left?’; some were observed to cover the A’ with their hand, look and answer.
Such an action, actual or imagined, is probably sufficient for the solution of the problem, which 
is the only one of the three operations not to require a multiple classification of any of the items 
in the array. A form of question which like Kofsky’s required a distinction between ‘blue ones’ 
and ‘just blue ones’ when the remainder was blue squares would undoubtedly be more difficult, 
if only because children have little facility with this logical construction. 

Errors on the B = A + A’ comparison tended to be an answer of ‘more A’, as if the question
had related A and A’. To succeed on this problem the child must count the B, hold that in
memory, count the A and the A’ together and compare this result with the first sum; or he must
recognize that there are no B which are not A or A’, and no A or A’ which are not B, and
thence infer the equality. Whichever strategy he uses, he must be able to recognize the defining
characteristics of B, A and A’, and he must overcome his expectation that the items in a
comparison are not identical. This is the only one of the three comparisons which requires an 
exact quantification. 

For the B > A problem, once the child has overcome the tendency to compare A with A′, he 
could as Piaget says follow the chain of reasoning B = A + A′, A = B − A′, A′ ≠ 0 ∴ B > A; or he 
could merely code B and A as respectively ‘a lot’ and ‘a few’ and thence infer that B > A. We 
know that young children use such relative codes and inferences with considerable facility 
(Bryant, 1974). Such a procedure would also be vulnerable to the difficulties caused by size and 
gestalt qualities of arrays shown in the second experiment reported here. 

It may be seen that the three constituent problems of class inclusion as posed by Piaget, and 
by this and other investigations following him, make very different demands on children's 
knowledge, attention and processing capacity. To answer correctly such a question as ‘if I took 
away the A′, what would be left?’ all that is necessary is to identify A′ and A and imagine the 
removal of the former. It is not necessary to relate either A or A′ to B; the cognitive demands of 
the problem are extremely simple. To deal with the B > A problem the child has to identify B 
and A and estimate their sizes, and has to overcome an expectation that the two articles in a 
comparison are different. As is to be expected, using two identical sets and comparing the B of 
the first with the A of the second (Inhelder et al. 1974), and using arrays where B is readily seen 
as an entity (Markman, 1973) both decrease the difficulty of the problem. The demands which 
the B = A + A′ problem adds are of the identification of A′, the relation of B, A and A′ to each 
other and an exact quantification of B and A + A′. For this alone of the three constituent 
problems of class inclusion Piaget’s postulated operations are probably necessary as well as 
sufficient; both A = B − A′ and B > A can be reached by simpler means. 
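
The three constituent problems can be set side by side in a minimal sketch. The array of boys and girls, the function names and the counts are illustrative assumptions and are not taken from the experiments; the sketch only shows that the subtraction and B > A problems need less information than the B = A + A′ problem.

```python
# Hypothetical array: B is "children", A is "boys", A' is "girls".
# Counts and names are assumptions chosen for illustration only.

def take_away(items, label):
    """Subtraction problem: what is left when items with `label` are removed."""
    return [x for x in items if x != label]

def whole_exceeds_part(b, a):
    """B > A problem: a comparison of two sizes is sufficient."""
    return len(b) > len(a)

def whole_equals_sum(b, a, a_prime):
    """B = A + A' problem: requires an exact count of all three sets."""
    return len(b) == len(a) + len(a_prime)

b = ["boy"] * 5 + ["girl"] * 3      # the whole class B
a = take_away(b, "girl")            # A: what is left when the A' are removed
a_prime = take_away(b, "boy")       # A': what is left when the A are removed
```

Only `whole_equals_sum` touches all three sets at once, which mirrors the claim that this comparison alone demands exact quantification.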

The model of children's performance on the class inclusion problem outlined above accounts 
for certain of the peculiarities of behaviour found in these samples and in other studies 
(Kohnstamm, 1968; Trabasso, 1975, personal communication). It may not be possible to rule out 
operational inadequacies as a cause of unsuccessful performance (Brainerd & Kaszor, 1974), but 
the results of these experiments have shown up so many situational sources of difficulty, and 
such a non-Piagetian order of achievement of operations that the operational model requires 
some amendment, and the class inclusion problem further analysis. 


Social class and communication style: The ability of middle and working 
class five year olds to encode and decode abstract stimuli 


Rhona Poole Johnston and C. H. Singleton 


This study investigated whether five year old children would show similar social class differences in coding 
style to those found in Heider’s (1971) study of ten year old children. Fifty-six middle and working class 
children aged five were asked to encode an abstract stimulus from an array of six similar items, so that 
another child present could identify the target item using only the information given by the encoder. Middle 
class children were found to make significantly greater use of a part-descriptive coding style, whereas 
working class children made a significantly greater use of a whole-inferential style; the findings were 
congruent with those of Heider on ten year olds. These differences were found to hold for a subsample of 
children matched on verbal intelligence, and, furthermore, the use of the various coding styles showed no 
significant correlation with verbal intelligence. The origins and implications of these social class preferences 
in coding style are discussed. 


Although of considerable interest, grammatical comparisons of the speech of middle and working 
class children do not yield information on the adequacy of the speech as a vehicle for 
interpersonal communication. If, as Bernstein (1970) has maintained, the working class child has 
the same tacit understanding of the linguistic rule system as the middle class child, then it can be 
argued that grammatical differences may play a relatively minor part in influencing the child's 
educational success or failure. By contrast, the way in which the child chooses to use his 
language to describe his world, to serve his personal needs, and to relate to others, may be a 
critical factor in determining the efficacy of the educational process. 

Halliday (1973) has suggested a classification scheme in which the social functioning of 
language is reflected in the linguistic structure, and he has identified in the speech of a young 
child a number of uses of language - the instrumental, the regulative, the interactional, and so 
on. This idea was expanded by Tough (1974), who analysed the speech of young children 
according to its use, finding marked social class differences in the frequency with which certain 
functional categories of language were used. The middle class children used language more for 
predicting, for collaborating in action, and for projecting through the imagination to a scene 
beyond the present context, whereas the working class children used language more often to 
secure attention for their own needs, to maintain their own needs in competition with the needs 
of others, and to monitor their own actions. A functional analysis of modes of speech was also 
carried out by Williams & Naremore (1969), who showed that while working class children's 
speech was more context-centred, middle class children made greater use of topic-centred 
speech and made less reference to context or particularistic experience. The authors argued that 
social class differences such as these originate from the type of communication required of the 
child in his social interactions, and that these different communication demands are a function of 
the different social class backgrounds in which the children are reared. 

However, these studies do not allow measurement of the effectiveness of these different 
types of communication, so that although the functions of the language may differ according to 
social class, no comparison can be made of the effect of these differences on interpersonal 
communication. 

Another approach has been the use of the 'coding' situation, where two people carry out a 
formal communication task such that one person encodes the target referent in order to make it 
identifiable by another person who cannot see the stimulus, and who must then select the correct 
referent from a number of non-referents. In this formal situation it is possible to examine not 
only the communicative effectiveness of the encoder’s message, by measuring the success rate of 
the decoder, but also the content of the message. Early studies of 
this type tended to rely on the former method of analysis: Glucksberg, Krauss & Weisberg (1966) 
studied referential communication in nursery school children using the coding technique, and 
assessed the quality of the communication according to whether the decoder could achieve the 
performance criterion. It was found that children aged 52-63 months could not perform the task 
with novel referents, although they could with familiar ones, but children of this age could 
perform correctly on the novel tasks if they were given either messages encoded by adults, or 
their own messages. Krauss & Rotter (1968) found that effectiveness of communication in this 
task varied both with social class status and age; the twelve year old boys were superior to the 
seven year old boys as speakers and listeners, and the middle class boys were superior to the 
working class boys in both speaking and listening. 

Further analysis can be made, however, in the coding situation, if the content of the encoder’s 
message is examined. Brent & Klamer (1967) found that college students tested on geometrical 
figures named the figures, whereas ‘inner-city’ adolescents did not. However, this finding 
probably indicates differences in background knowledge, rather than social class differences in 
coding style. Heider (1971) made a major analysis of the different styles of coding used to 
describe abstract figures by ten year old middle and working class children. The encoder was 
shown five arrays of abstract figures and five arrays of line drawings of faces; in each case he 
was asked to describe the target stimulus so that another child of the same age would be able to 
pick out that one from a number of others, using only the information provided by his 
description. Two dimensions were differentiated in the speech recorded from the encoders, the 
whole-part and the inferential-descriptive dimensions. The whole-part distinction refers to 
whether the child’s statement described the whole of the stimulus figure or part of it, and the 
inferential-descriptive distinction refers to whether the child described the stimulus metaphorically 
(inferential) or whether an abstract account was given of the physical properties of the 
stimulus (descriptive). It was found that the middle class encodings were significantly more often 
part-descriptive statements (e.g. ‘It has a point on top and a point at the bottom’) and the 
working class encodings contained a significantly greater use of whole-inferential descriptions 
(e.g. ‘It looks like a star’). In decoding, Heider found that the middle class decoders were 
superior to the working class decoders, and that each social class group was better at decoding 
the style of encoding used predominantly by their own group. 

It could be argued that these social class differences stemmed from factors that were not 
controlled for in Heider’s study. According to Bernstein (1970), children from middle class 
homes tend to come from predominantly person-orientated families, and children from working 
class homes in general come from position-orientated families; Bearison & Cassel (1975) have 
shown with six year old children that on five measures comparing the form and content of 
messages conveyed to sighted and blindfolded listeners (the latter situation corresponding to the 
experimental situation used by Heider in that the listener’s perspective is not the same as that of 
the encoder), the children from person-orientated families accommodated their communication to 
the listener’s perspective more than the children from position-orientated families. In addition, 
Cox (1975) has found that children are better able to coordinate the perspective of another 
person if the actual person is present. Thus it may be the case that the working class children in 
Heider’s study used a different coding style because they were less able to accommodate 
themselves to the perspective of the decoder, whose existence they had to imagine. Another 
variable that Heider did not investigate was the effect of verbal intelligence on coding styles; the 
findings would seem to have more general relevance if it could be shown that they were 
independent of verbal intelligence levels and therefore attributable to differences in cultural 
linguistic style rather than in information-processing ability. 

The present study was conducted to investigate whether Heider's findings could be replicated 
in a study of British middle and working class five year olds, who would perform a similar task 
to Heider’s, but the experimental situation would differ in that the decoder would be present and 
would be seen by the encoder to be actively guessing the target stimulus. It is argued that if the 
working class dependency on the whole-inferential approach to the task in Heider's study was 
due to the fact that the decoder was not present, then the working class children in this study 
should not differ significantly from the middle class children in the coding styles they utilize. The 
study was also designed to investigate whether the verbal intelligence of the encoders and 
decoders would affect the findings, or whether the different use of coding styles by the two 
social class groups would be independent of verbal intelligence levels. A subsample, using a 
matched-pair design on verbal intelligence, was therefore constructed by selecting a number 
of children from the original sample of children, who had been left unmatched for verbal 
intelligence. It was predicted that if verbal intelligence was a factor influencing the use of 
coding styles, the encodings from the middle and working class children in the matched-pair 
design would be less likely to differ significantly in style. 


Method 
Subjects 


A sample of five year old children was drawn from each of two schools, one of which was situated on a 
Council Estate in Hull (constituting the working class sample), the other being a private school in a 
residential area outside Hull (constituting the middle class sample). Each child's social class status was 
assessed by referring the father's occupation to the Registrar General's classification (1970). Initially, 46 
children were tested at the school in Hull; 21 of these had fathers in semi-skilled occupations (Class IV 
according to the Registrar General's classification), and the fathers of the remaining 25 children had unskilled 
occupations (Class V). At the school outside Hull, 22 children were tested in the first instance; 9 of these had 
fathers in professional occupations (Class I), while the other 13 children had fathers with senior managerial 
or intermediate professional occupations (Class II). 

All these children were tested using the Peabody Picture Vocabulary Test (PPVT). Although there has 
been some debate on the validity of this test of hearing vocabulary as a measure of verbal intelligence, Dunn 
(1965) quotes several studies of children of primary school age in which correlations in the range of 0.67 to 0.82 
were reported between the PPVT and established tests of verbal intelligence (e.g. WISC Verbal Scale). In 
this paper, therefore, the term verbal intelligence has been used when referring to PPVT test scores. 

The children at each school were then matched in boy-girl, boy-boy, and girl-girl pairs, so that the verbal 
IQ scores for each pair matched as closely as possible. This resulted in there being 38 children matched in 
pairs in the working class school (32 in boy-girl pairs, and 6 in boy-boy pairs), and 18 children matched in 
pairs in the middle class school (14 in boy-girl pairs, and 4 in girl-girl pairs). The mean IQ of the working 
class children was 92.66, and the mean IQ of the middle class children was 106.94. This difference between 
the IQ scores was statistically significant (F = 12.51, d.f. = 1, 54, P < 0.001). The mean age of the working 
class group was 5 years 4 months, ranging from 4 years 11 months to 5 years 7 months; the mean age of the 
middle class group was 5 years 3 months, ranging from 5 years 0 months to 5 years 6 months. 

From this sample a subsample was selected so that each middle class pair could be matched on age and 
verbal intelligence with a working class pair. This resulted in five working class boy-girl pairs being matched 
with five middle class boy-girl pairs, so that ten middle class and ten working class children constituted this 
subsample. The mean IQ of the matched working class sample was 98.5, and the middle class mean IQ was 
99.8. This difference was not statistically significant. The mean age of the working class subjects was 5 years 
2 months, ranging from 5 years 1 month to 5 years 4 months; the mean age of the middle class subjects was 
5 years 4 months, ranging from 5 years 2 months to 5 years 6 months. 


Materials 


The experimental materials consisted of ten arrays of abstract figures, six of the stimulus items in the arrays 
being drawn from the stimuli described by Heider (1971). These six items had been employed originally 
because of their ‘low codability’, that is, because they were found to be difficult to name. The other four 
stimulus items were slightly more codable, and were drawn up in order to introduce the task to the children; 
it was decided that these practice arrays should consist of abstract figures so that the children would not form 
the idea that the task involved the simple naming of the pictures. There were two copies of each array, one 
having the target stimulus outlined in red ink for the encoder, the other being unmarked for the decoder. The 
position of appearance of the target stimulus was randomised across the six possible positions of occurrence 
in the ten arrays. Figures 1 and 2 show examples of two of the experimental arrays; the items outlined in 
bold print indicate in each case the target stimuli. 


Figure 1. Experimental array. 
Figure 2. Experimental array. 


Procedure 


The subjects were tested in the pairs to which they had been allocated, each pair being tested together in an 
experimental room in which they were seated opposite each other, with a screen placed between them to 
obscure the stimulus arrays with which they were presented. The position of encoder or decoder was 
assigned so that half of the girls and half of the boys encoded first in the boy-girl pairs, and in the other 
pairs the position of the encoder was assigned randomly. The same pair would be asked to carry out the task 
again on another occasion so that the encoder could then act as the decoder, and vice versa. 

Copies of the first of the four practice arrays were then handed to each of the children, the encoder 
receiving the array with the target stimulus outlined in red. The encoder’s attention was drawn to the fact 
that one of the stimuli was outlined in red, and was shown that the other child’s array did not contain this 
feature. The encoder was then asked to tell the other child about the outlined picture so that he could guess 
which one was being talked about. When the encoder stopped talking, he was given one prompt by being 
asked ‘Is there anything else you can tell - (decoder's name) about it?’ When the four practice arrays had 
been presented, the experimental arrays were introduced. This time the encoder’s responses were recorded 
on a tape-recorder, and the decoder’s responses were scored by the experimenter according to whether the 
guess was correct or incorrect. The encoder’s responses were later transcribed and categorized according to 
Heider's analysis. 

Heider describes two dimensions of classification, the whole-part and the inferential-descriptive. A 
statement was classified as ‘whole’ if it referred to the whole of the design, or ‘part’ if it referred to part of 
the figure; on the other hand, the statement was designated as ‘inferential’ if the picture was described 
metaphorically in terms of an object, or ‘descriptive’ if the physical properties of the design were described. 
Combining the two dimensions gives rise to four possible categories, examples of which are given in 
parentheses. (These examples are taken from actual protocols elicited from the experimental stimulus shown 
in Fig. 1): (1) whole-inferential (‘it looks like a body’), (2) whole-descriptive (‘it’s got seven sides’), (3) 
part-inferential (‘it looks like some sleeves hanging down’), and (4) part-descriptive (‘it's got a curve at the 
top’). Descriptions in the second category, however, were very infrequent both in Heider's analysis and in 
this study. 
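
The two-dimensional scheme can be written down as a small data structure. The sketch below is an illustration only: the names `Style`, `classify` and `style_proportions` are hypothetical, and the judgement of whether a statement is whole or part, inferential or descriptive, is assumed to have been made already by a human coder.

```python
from collections import Counter
from enum import Enum

class Style(Enum):
    """The four categories obtained by crossing the two dimensions."""
    WHOLE_INFERENTIAL = "whole-inferential"    # 'it looks like a body'
    WHOLE_DESCRIPTIVE = "whole-descriptive"    # 'it's got seven sides'
    PART_INFERENTIAL = "part-inferential"      # 'it looks like some sleeves hanging down'
    PART_DESCRIPTIVE = "part-descriptive"      # 'it's got a curve at the top'

def classify(scope, mode):
    """Combine a whole/part judgement with an inferential/descriptive one."""
    return Style(f"{scope}-{mode}")

def style_proportions(units):
    """Proportion of a child's coding units falling in each category."""
    if not units:
        raise ValueError("at least one coding unit is required")
    counts = Counter(units)
    return {style: counts[style] / len(units) for style in Style}

# Hypothetical protocol of four coding units from one child.
protocol = [
    classify("whole", "inferential"),
    classify("part", "descriptive"),
    classify("part", "descriptive"),
    classify("part", "inferential"),
]
proportions = style_proportions(protocol)
```

Expressing each style as a proportion of the child's total coding units matches the form in which the scores enter the analysis reported in the Results.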


Results 


In the unmatched sample, differences in (a) encoding style, and (b) decoding ability, were 
analysed by means of separate one-way analyses of variance for both social class and sex. The 
scores for the use of each style were expressed as a proportion of the total number of coding 
units used by each child and subjected to an arcsin transformation to normally distribute them 
(Winer, 1962). It was found that the working class group used the whole-inferential style of 
encoding significantly more often than the middle class group (F = 9.34, d.f. = 1, 54, P < 0.01), 
and that the middle class group made significantly greater use of the part-descriptive style 
(F = 18.95, d.f. = 1, 54, P < 0.001). No significant difference was found in the use of the 
part-inferential style, and the occurrence of responses in the whole-descriptive category was too 
small to allow any analysis. No significant sex differences were found for either encoding style 
or decoding ability, nor were any significant social class differences found in decoding ability. 

It could be argued from this analysis that as the verbal IQ scores of the two social class 
groups differed significantly these results could be due to the uncontrolled effect of verbal 
intelligence. It was for this reason that a matched-pair design on verbal intelligence was 
constructed from the unmatched sample so that each middle class pair was matched with a 
working class pair with similar verbal IQ scores. Two-way factorial analyses of variance were 
then carried out on the coding styles used by the children, social class and sex being the two 
factors examined. It was found that the working class group used the whole-inferential style 
significantly more often than the middle class children (F = 5.02, d.f. = 1, 16, P < 0.05), and the 
middle class group made greater use of the part-descriptive style (F = 10.47, d.f. = 1, 16, 
P < 0.01). No significant difference was found in the use of the part-inferential style, and again 
the use of the whole-descriptive style proved too infrequent to permit analysis. No differences in 
the ability to decode the messages were found between the two social class groups. 
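
The arcsin transformation applied to the proportion scores can be sketched as follows. The exact form used is not stated in the text; the sketch assumes the common variant 2·arcsin(√p), and the function name is hypothetical.

```python
import math

def arcsine_transform(p):
    """Assumed form of the transformation: 2 * arcsin(sqrt(p)).

    `p` is the proportion of a child's coding units in one style.
    The transformation stretches values near 0 and 1, where raw
    proportions are compressed and their variance depends on the mean.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("a proportion must lie between 0 and 1")
    return 2.0 * math.asin(math.sqrt(p))

# Hypothetical proportions for three children.
transformed = [arcsine_transform(p) for p in (0.10, 0.50, 0.90)]
```

The transformed scores, rather than the raw proportions, would then be entered into the analyses of variance.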

It could be argued that a thorough test had not been made of the effect of verbal intelligence 
on choice of coding style because in order to create a matched sample it had been necessary to 
select subjects from a very narrow IQ range. Consequently, a correlational analysis (Pearson's r) 
was carried out in order to determine the extent of the relationship between verbal IQ and the 
use of the various coding styles in the total sample. However, none of these correlations proved 
to be statistically significant, indicating that verbal intelligence was not a factor influencing the 
findings. 


Discussion 


Whilst it could be argued that the samples used in this investigation were relatively small, 
nevertheless the results clearly support Heider's (1971) finding. The working class children used 
the whole-inferential mode of expression significantly more often than middle class children, and 
the middle class children made greater use of the part-descriptive style; the manipulation of the 
experimental situation so that the decoder was present did not produce any difference in the 
preferred style of encoding of the working class children. Also in agreement with Heider's 
results is the fact that no significant sex differences were found in either coding style or decoding 
ability. The construction of a matched-pair design on verbal intelligence did not affect the 
direction of the findings from the larger sample left uncontrolled on this variable, and it was 
demonstrated that there was no relationship between the verbal intelligence scores of the total 
sample of children and their use of the whole-inferential, part-descriptive, and part-inferential 
coding styles. 

These findings do not, however, completely refute the suggestion that the working class mode 
of encoding may be due to an inability to accommodate to the listener's perspective. Although 
the encoder was able to see that the other child was listening to the message and was in the first 
instance shown the array that was placed before the decoder, it is still possible that the working 
class child could not orientate himself to the decoder's perspective. However, to argue this 
interpretation it would be necessary to show that the working class children's encodings were 
less effective in conveying the identity of the target stimulus to the listener. One way of doing 
this is to compare the success rate of the middle class and working class decoders. No significant 
social class differences were found in this study, however, in the ability to decode the messages. 
The mean correct score for the total working class sample was 3.21 (out of 6) and the mean 
score for the middle class children was 2.83. Since a score of only 1 out of 6 would be expected 
by chance in this task, it is clear that some information is being successfully transmitted. 
However, this does accord with Glucksberg et al.'s (1966) finding that children of this age are 
not very proficient at decoding descriptions of abstract stimuli encoded by other children. 
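
The chance baseline quoted above follows from six experimental arrays of six items each. The sketch below reproduces that arithmetic and, as an added illustration not reported in the study, gives the probability that a single child guessing independently on every array would score 3 or more; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def chance_expected(n_arrays, n_items):
    """Expected number correct when guessing one item per array."""
    return Fraction(n_arrays, n_items)

def prob_at_least(k, n_arrays, n_items):
    """Binomial probability of k or more correct guesses by chance."""
    p = Fraction(1, n_items)
    return sum(
        comb(n_arrays, i) * p**i * (1 - p) ** (n_arrays - i)
        for i in range(k, n_arrays + 1)
    )

expected = chance_expected(6, 6)       # the "1 out of 6" baseline
p_three_or_more = prob_at_least(3, 6, 6)
```

This treats each guess as independent and equally likely across the six positions, which is an assumption about the decoders rather than a finding.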

Further experimentation would have to be carried out on older children able to decode such 
messages in order to establish whether working class encodings are any less effective than those 
of middle class children, independently of verbal intelligence test scores. However, it is 
necessary to distinguish between the adequacy of the encoder's message and the ability of the 
decoder to interpret it; Glucksberg et al. also showed that a substantial number of the children 
aged 46-63 months that they tested could decode the message about a novel referent if they were 
given reference phrases formulated by adults. It would have been of great interest in this study 
to have investigated whether middle class children were able to decode working class encodings, 
and vice versa, but unfortunately this was not possible to arrange. 

The fact that the findings in this study are in agreement with Heider's is of great interest. 
Heider's subjects were not only aged ten, five years older than the children of this study, but her 
sample included black working class children. Considering the age and racial differences, not to 
mention the difference in cultural background between British and American children, it is 
remarkable that Heider's results were so closely replicated in this study. It would seem therefore 
that a close examination of the styles of encoding which emerged is in order because of the 
consistency of this finding in European and American children, its developmental consistency, 
and the fact that it seems to be unrelated to verbal intelligence levels. The suggestion that the 
use of different styles may be due to working class difficulty in coordinating perspectives does 
not detract from the educational importance of these findings, although it would be of great 
interest to establish whether this was a variable affecting the findings. However, all children have 
to function in a similar school situation, regardless of social class, and difficulties with the social 
psychology of this situation are probably as important factors in the success of the child in the 
school as differences in cognitive approach to communication tasks and the resulting style of 
speech used. 

In order to illustrate the differences between the two coding styles, an example of a middle 
class and of a working class encoding will be given. The following is an example of a middle 
class boy's (verbal IQ 112) encoding of the second experimental stimulus (see Fig. 2). ‘Well, it's 
got a straight line, then a round, straight line, and a round, like a straight line, but it's got a 
bump, and a straight line, and a straight line, round, and like a bump again, and then straight.’ A 
working class boy (verbal IQ 120) encoded the same stimulus as ‘It looks like an apple stand and 
there's balls on it’. (It should be pointed out that all the children used a mixture of all four 
possible coding styles, but the proportions used of each varied from child to child and between 
social classes.) The middle class boy's approach seems to be more analytical than the working 
class boy's; his strategy was to start at one point of the abstract design and to describe it in 
detail, progressing around the form until he reached the starting point again. The working class 
boy, however, produced a global description, using a metaphor to describe what the figure 
looked like. It would be difficult to argue that the middle class child responded with a more 
effective message than the working class child; it could be argued that the working class child's 
encoding was potentially more informative as it gave a strong visual impression of the stimulus, 
whereas the middle class child's description would be incomprehensible to a listener who could 
not see the stimulus. Indeed, Heider, Cazden & Brown (1968) report in their study of ten year 
olds that ‘wholistic’ images were more successfully decoded than ‘analytic’ images if the social 
class of the encoder and decoder was ignored. 

It was noticeable that the middle class children produced descriptions using ‘this’, ‘there’, and 
‘that’, where the referents of these words could only be understood by seeing what the child 
was pointing to, almost as frequently as the working class children. In the matched sample, these 
items of exophoric reference accounted for 4.5 per cent of the total words used by the middle 
class children, and 5.9 per cent of those of the working class children; this suggests that even 
middle class children of this age are not very adept at accommodating to the viewpoint of the 
decoder. 

The most crucial difference, then, seems to be the middle class children's tendency to describe 
each part of the design, often in minute detail, whereas the working class children predominantly 
select a description which applies to the whole picture, which includes comparing it to objects 
they see in their everyday life. To this extent, the working class children's responses are more 
dependent on the decoder having shared experience of the same objects. The middle class 
children's encodings are less dependent on shared experience in this respect, but are more 
context bound in that their descriptions tend to be difficult to understand in the absence of the 
referent. 

It is important to reflect on the fact that the use of different coding styles in this study was not 
related to verbal intelligence levels. If it were also shown that the use of these styles is unrelated 
to non-verbal intelligence, then it could be argued that the differences found may be due to 
differences in cognitive style, rather than differences in ability. Hess & Shipman (1965) have 
argued that social class differences in problem-solving styles are related to the type of teaching 
adopted by the child's mother. Thus it would seem that the socializing of the child by his 
mother, the way she interprets experiences for him, and teaches him to manipulate and interact 
with the objects around him, will have important consequences for his later mode of cognitive 
functioning, which may then be reflected in his style of encoding. 


The processing of negative or incongruent perceptual and combined 
perceptual/linguistic stimuli 


Rumjahn Hoosain 





A psychological model of mediational processes based on the derivation of the meaning of perceptual or 
linguistic signs predicts increased processing times for signs that have negative affective meaning and 
conjoined signs with incongruent affective meanings. The procedure used in the experiments enables 
identical responses for positive and negative signs. The postulated effects are shown to be valid for 
perceptual as well as conjoined perceptual and linguistic cognitions, using schematic drawings of facial 
expressions and quotations attributed to persons with the expressions. The findings suggest that perceptual 
and linguistic cognitions share the same semantic systems, although the latter are more specific. 


According to the neo-behaviouristic theory of meaning (e.g. Osgood, 1971) a sign acquires its 
meaning by being associated with behaviour towards things it signifies. Its meaning is considered 
as a process mediating between the sign as a stimulus and response to the sign. This mediating 
process itself is analysable into components, very often bipolar in nature. For example, we have 
the good and the bad of evaluation, the big and the small of potency, and the fast and the slow 
of activity in affective meaning as measured by the semantic differential. This bipolar 
organization of meaning components originates in the reciprocally antagonistic basis of 
behaviours towards external stimuli (e.g. approach vs. avoidance behaviours). 

Arising from this bipolar organization of human experience there are two hypotheses about 
cognitive dynamics: 


Hypothesis I: In either perceptual or linguistic cognition, the processing of signs which have 
positive affective meaning will be faster than that of signs which have negative affective 
meaning. 


It is assumed that affectively negative perceptual or linguistic signs require longer response times 
because their meanings are derived from overt avoidance behaviours towards negative stimuli. 
The hypothesized effect of such an acquisition history is an ‘avoidance of cognizing’ of negative 
signs. 


Hypothesis II: In conjoined cognitions processing times for congruent signs will be faster 
than for incongruent signs. 


Since meanings of signs are derived from behaviour to stimuli, it follows that conjoined 
cognitions involving incongruent signs should result in simultaneously positive and negative 
reaction tendencies. Such cognitive incongruence is assumed to produce a ‘freezing’ and result 
in increased processing time, just as simultaneous tendencies toward approach and avoidance 
behaviours result in tension, conflict, and ‘freezing’. The validity of these two hypotheses in the 
case of linguistically derived cognitions was demonstrated by Hoosain (1973, 1974). The purpose 
of the two experiments reported in this paper was to test the validity of the same hypotheses for 
perceptual cognitions as well as for mixed cognitions. 

The perceptual stimuli (signs) used were outline facial pictures (see Fig. 1) taken from 
Cuceloglu (1967). These facial expressions were generated by the use of different eyebrow types, 
eye types, and mouth types. Cuceloglu’s subjects were asked to rate each face on 40 
emotion-name scales (e.g. slightly/somewhat/very like/unlike anger). The data were factor analysed, 
yielding three bipolar dimensions identified as pleasantness, activation and control (comparable 
to evaluation, activity, and potency in semantic differential studies), accounting for about 70 per 
cent of the variance. These facial expressions provided appropriate visual stimuli for the present 
study since they have independently tested affective meanings. 


Figure 1. Positive and negative facial expressions. 


Experiment I 

Faces +1 to +6 in Fig. 1 have positive loadings on the pleasantness dimension and faces −1 to 
−6 have negative loadings. The 12 faces were each used twice to form 12 pairs of facial 
expressions, three pairs for each of the following polarity conditions: 

(1) Congruent Positive (+1/+2; +3/+4; +5/+6): for these three pairs the constituent faces had 
positive loadings on the pleasantness dimension; they had identical mouth features and one 
difference in either the brows or the eyes features. 

(2) Congruent Negative (−1/−2; −3/−4; −5/−6): the constituent faces had negative loadings 
on the pleasantness dimension; they had identical mouth features and one difference in either the 
brows or the eyes features. 

(3) Incongruent₁ (+1/−1; +2/−6; +5/−5): the constituent faces had opposed-sign loadings on 
the pleasantness dimension; they had different mouth features but identical brows and eyes. 

(4) Incongruent₂ (+3/−3; +4/−2; +6/−4): the constituent faces had opposed-sign loadings on 
the pleasantness dimension and at least one other dimension; they had different mouth features 
as well as differences in one or both of the brows and eyes features. 

These pairs of facial expressions were presented on slides, the constituent faces side by side 
with an underlined blank (____) between them. One group of subjects had the faces paired in one 
left-to-right order and a second group of subjects had them in the reversed order. This was done 
to test for possible ordering effects. Thus for the Incongruent₂ condition one group saw the 
pleasant faces on the left (+3/−3, +4/−2, and +6/−4) and the other the unpleasant on the left 
(−3/+3, −2/+4, and −4/+6). The two orderings will be referred to as order 1 and order 2 
respectively. Four additional slides, one representing each of the four polarity conditions, were 
used as practice items. 
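
The pairing scheme described above can be written out as data. The following is a minimal sketch in Python and is not part of the original study: the dictionary keys, the labels incongruent_1 and incongruent_2 (the two incongruent conditions in the order listed), and the function names are illustrative assumptions. Faces are represented by their signed numbers from Fig. 1.

```python
from collections import Counter

# Signed integers stand for the faces in Fig. 1: +n pleasant, -n unpleasant.
# Pairs are (left face, right face) for order 1, copied from conditions (1) to (4).
ORDER_1 = {
    "congruent_positive": [(1, 2), (3, 4), (5, 6)],
    "congruent_negative": [(-1, -2), (-3, -4), (-5, -6)],
    "incongruent_1": [(1, -1), (2, -6), (5, -5)],
    "incongruent_2": [(3, -3), (4, -2), (6, -4)],
}


def reverse_order(pairs_by_condition):
    """Order 2: the same pairs with left and right positions swapped."""
    return {
        condition: [(right, left) for left, right in pairs]
        for condition, pairs in pairs_by_condition.items()
    }


def expected_response(pair):
    """'and' when both faces load with the same sign on pleasantness, else 'but'."""
    left, right = pair
    return "and" if (left > 0) == (right > 0) else "but"


ORDER_2 = reverse_order(ORDER_1)

# Count how often each face is used across the 12 pairs.
FACE_USAGE = Counter(
    face for pairs in ORDER_1.values() for pair in pairs for face in pair
)
```

Counting the faces in this way confirms the statement that each of the 12 faces is used exactly twice, and reversing the pairs reproduces the order 2 listing given for the Incongruent₂ condition.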


Subjects 


Two groups of 14 female subjects each were used. They were introductory psychology students at the 
University of Illinois, and their participation in the experiment was a course requirement. 


Procedure 


Subjects were run individually. The apparatus was designed in such a way that, when each slide with a pair 
of facial expressions was projected on a screen a photocell was activated; this in turn activated a timer. A 
voice-key was attached to the circuit so that the timer was stopped when the subject responded by saying 
and or but into a microphone. 

Subjects were asked to imagine that each pair of presented faces represented two different people reacting 
to the same situation with their respective feelings and expressions. The faces either showed positive or 
negative kinds of feelings, and subjects were to respond by saying and or but to conjoin the faces, 
depending on whether they thought the respective expressions were congruent or incongruent. They were 
then shown the four practice slides followed by the 12 experimental slides in random order, with the 
constraint that no two slides that had an identical constituent face were presented consecutively. Fourteen 
subjects were shown the 12 pairs of faces in order 1 and the other 14 were shown the same pairs of faces in 
order 2. They were asked to look at the presented slides from left to right. Their response latency for each 
presented slide was recorded, as well as whether their actual response was and or but. 
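
The text does not say how the constrained random order was produced. One simple way to obtain such an order is rejection sampling: shuffle the slides and accept the shuffle only if no two consecutive slides share a face. The sketch below illustrates this under that assumption; the function name, the retry limit and the use of Python's random module are all hypothetical and are not taken from the original procedure.

```python
import random

# The 12 experimental slides of order 1, as (left face, right face).
SLIDES = [
    (1, 2), (3, 4), (5, 6),
    (-1, -2), (-3, -4), (-5, -6),
    (1, -1), (2, -6), (5, -5),
    (3, -3), (4, -2), (6, -4),
]


def shares_face(slide_a, slide_b):
    """True if the two slides have at least one constituent face in common."""
    return bool(set(slide_a) & set(slide_b))


def constrained_order(slides, rng=None, max_tries=10_000):
    """Shuffle until no two consecutive slides share a face (rejection sampling)."""
    rng = rng or random.Random()
    for _ in range(max_tries):
        order = list(slides)
        rng.shuffle(order)
        if not any(shares_face(a, b) for a, b in zip(order, order[1:])):
            return order
    raise RuntimeError("no valid ordering found within max_tries")


example_order = constrained_order(SLIDES, random.Random(0))
```

Because each slide shares a face with only two of the other eleven, a valid order is usually found within a few dozen shuffles, so the retry limit is a safeguard rather than a practical restriction.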


Results 


Before analysing the results a correction was made for the articulatory and acoustic differences 
between saying and and saying but. The latency difference was estimated by Hoosain (1973) to 
be 36 msec, with and faster than but. Therefore, 36 msec were subtracted from each of the but 
response times (for each actual but response, regardless of whether but was the expected 
response according to polarity condition). It should be noted that this correction works against 
Hypothesis II — that incongruent cognitions require more processing time. One response time that 
was over 5 sec was discarded from the analyses. 
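
The two preprocessing steps just described (dropping very long latencies and subtracting 36 msec from every actual but response) can be sketched as follows. This is an illustration only: the function name and the example latencies are hypothetical, and the sketch assumes that the 5 sec cut-off was applied to the uncorrected latency, which the text does not state.

```python
BUT_OFFSET_SEC = 0.036  # and/but articulation difference estimated by Hoosain (1973)
CUTOFF_SEC = 5.0        # response times over 5 sec are discarded


def correct_latencies(trials):
    """Apply the cut-off and the but correction to (response, latency_sec) trials.

    Assumption: the cut-off is tested on the raw latency, before correction.
    """
    corrected = []
    for response, latency in trials:
        if latency > CUTOFF_SEC:
            continue  # discarded from the analyses
        if response == "but":
            latency -= BUT_OFFSET_SEC  # applied to every actual but response
        corrected.append((response, round(latency, 3)))
    return corrected


# Hypothetical latencies, for illustration only (not data from the study).
example = correct_latencies([("and", 1.700), ("but", 2.000), ("but", 5.400)])
```

Since the subtraction shortens only the but responses, it reduces any observed advantage of congruent (and) over incongruent (but) judgements, which is why the correction is said to work against Hypothesis II.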

And or but responses which differed from what was expected by the experimenter in terms of 
the predetermined affective meanings of the respective conjoined faces (and for congruent 
pairings and but for incongruent) will be referred to as ‘errors’. The total number of ‘errors’ for 
order 1 was 7 (out of 14 × 3 = 42 responses = 17 per cent) for the congruent positive condition, 
8 (19 per cent) for the congruent negative condition, 11 (26 per cent) for the incongruent₁ 
condition, and 12 (29 per cent) for the incongruent₂ condition; for order 2 there were 10 (24 per 
cent) ‘errors’ for the congruent positive condition, 2 (5 per cent) for the congruent negative 
condition, 14 (33 per cent) for the incongruent₁ condition, and 7 (17 per cent) for the 
incongruent₂ condition. A two-way analysis of variance with repeated measures, with ordering 
(order 1 vs. order 2) and polarity condition as the two main factors, showed no significant main 
or interaction effects. In the following analyses all response latencies were retained, including 
those called ‘errors’. The latter were simply responses that differed from the experimenter’s 
expectations; nothing was wrong with the data themselves. Subjective 
congruence/incongruence judgements should be contrasted with judgements of objective facts, 
such as when subjects decide whether ‘a canary is a bird’ is true or false, whether A is above or 
below B, or whether a circle and an oblong are the same or different (e.g. Clark & Chase, 1972; 
Seymour, 1973). In objective judgements there is something wrong with unexpected responses. 
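
The percentages quoted above follow directly from the counts, since each cell contains 14 subjects × 3 slides = 42 responses. The short check below recomputes them; the variable names are illustrative, and the two incongruent conditions are labelled incongruent_1 and incongruent_2 in the order in which the text reports them.

```python
RESPONSES_PER_CELL = 14 * 3  # 14 subjects x 3 slides per polarity condition

# 'Error' counts as reported in the text for Expt. I.
ERRORS = {
    "order_1": {"congruent_positive": 7, "congruent_negative": 8,
                "incongruent_1": 11, "incongruent_2": 12},
    "order_2": {"congruent_positive": 10, "congruent_negative": 2,
                "incongruent_1": 14, "incongruent_2": 7},
}


def percent(count, total):
    """Percentage rounded to the nearest whole number."""
    return round(100 * count / total)


CELL_PERCENT = {
    order: {cond: percent(n, RESPONSES_PER_CELL) for cond, n in cells.items()}
    for order, cells in ERRORS.items()
}

TOTAL_ERRORS = sum(sum(cells.values()) for cells in ERRORS.values())
TOTAL_RESPONSES = RESPONSES_PER_CELL * 4 * 2  # 4 conditions x 2 orderings
OVERALL_PERCENT = percent(TOTAL_ERRORS, TOTAL_RESPONSES)
```

The overall figure obtained in this way, 71 of 336 responses, rounds to the 21 per cent that the Results of Expt. II quote for Expt. I.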

Figure 2 shows the average response latencies across the 14 subjects in each group as a 
function of affective polarity of the conjoined perceptual signs. Each subject’s response times 
for the three presented slides of the same polarity condition were averaged for analyses. A 
two-way analysis of variance with repeated measures was made on these averaged response 
times, with four levels of the polarity factor and two levels of the ordering factor. 

There was no significant difference between the two presentation orderings, but a significant 
polarity effect (F = 5.3, d.f. = 3, 78, P < 0.01), and no significant interaction between the two 
main effects. The fact that there was no left-to-right ordering effect indicated that for the two 
incongruent conditions it did not matter whether the positive sign was to the left of the negative 
sign and thus processed first, or the reverse. Subjects could, of course, alternately look at the 
two presented faces in quick succession before deciding on their congruence/incongruence 
relation. 


Figure 2. Response latencies for perceptual signs. 

The data for the two orderings were then collapsed for the following three orthogonal planned 
individual comparisons. (1) Congruent cognitions, with expected and responses (congruent 
positive together with congruent negative) were compared with incongruent cognitions, with 
expected but responses (incongruent₁ together with incongruent₂); there was a significant 
difference, with the mean for congruent cognitions being 1.815 sec and the mean for incongruent 
cognitions being 1.961 sec (F = 4.81, d.f. = 1, 52, P < 0.05). However, as can be seen from Fig. 2, 
this was entirely due to the ease of congruent positives. (2) Comparing the two congruent 
conditions, congruent positive had a mean of 1.663 sec and congruent negative had a mean of 
1.966 sec (F = 10.38, d.f. = 1, 52, P < 0.005). (3) Comparing the two incongruent conditions, 
incongruent₁ had a mean of 1.909 sec and incongruent₂ had a mean of 2.013 sec. The difference 
was not significant (F = 1.23, d.f. = 1, 52). 
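
The three planned comparisons can be expressed as contrast weights over the four polarity conditions, and their orthogonality checked directly. The sketch below is illustrative only: the weight vectors are the conventional coding for these comparisons rather than something stated in the text, the names are hypothetical, and the orthogonality check assumes equal cell sizes. It does not reproduce the F tests, which require the error term from the full data.

```python
from itertools import combinations

# Condition order: congruent positive, congruent negative, incongruent 1, incongruent 2.
MEANS_SEC = (1.663, 1.966, 1.909, 2.013)  # condition means reported in the text

CONTRASTS = {
    "congruent_vs_incongruent": (1, 1, -1, -1),
    "congruent_positive_vs_negative": (1, -1, 0, 0),
    "incongruent_1_vs_2": (0, 0, 1, -1),
}


def dot(u, v):
    """Sum of products of corresponding elements."""
    return sum(a * b for a, b in zip(u, v))


def is_orthogonal_set(contrasts):
    """With equal cell sizes: every contrast sums to zero and every pair has zero dot product."""
    weights = list(contrasts.values())
    sums_to_zero = all(sum(w) == 0 for w in weights)
    pairwise_zero = all(dot(u, v) == 0 for u, v in combinations(weights, 2))
    return sums_to_zero and pairwise_zero


ORTHOGONAL = is_orthogonal_set(CONTRASTS)
CONGRUENT_MEAN = (MEANS_SEC[0] + MEANS_SEC[1]) / 2
INCONGRUENT_MEAN = (MEANS_SEC[2] + MEANS_SEC[3]) / 2
```

Averaging the rounded condition means gives about 1.81 sec for congruent and 1.961 sec for incongruent cognitions, in agreement with the pooled means reported for comparison (1) to within rounding.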

To summarize, both Hypothesis I and Hypothesis II were supported for perceptual (facial) 
signs. As can be seen from Fig. 2, response latencies for the congruent negative condition and 
the two incongruent conditions were in the same general range, all being longer than those for 
the congruent positive condition. For the congruent negative condition there were two affectively 
negative perceptual signs to be processed, for each of the incongruent conditions there was one 
negative perceptual sign plus the incongruence relation between the conjoined signs, whereas 
neither negative signs nor incongruence were involved in the congruent positive condition. 


Experiment II 

Materials 


Each presented slide consisted of two parts, a perceptual stimulus (one of the 12 facial expressions displayed 
in Fig. 1) and a linguistic stimulus (one of the 12 sentences listed in Table 1). The 12 sentences were in pairs 
with antonymous quotations, one being clearly positive and the other clearly negative. Each of the six 
positive facial expressions was paired with one of the six positive sentences to yield six congruent positive 
conjoined cognitions (the actual pairings, in face-sentence ordering, were +6/+1; +2/+2; +4/+3; +5/+4; 
+3/+5; and +1/+6), and then with one of the six negative sentences to yield six incongruent, conjoined 
cognitions (+6/−1; +2/−2; +4/−3; +5/−4; +3/−5; and +1/−6). The six negative facial expressions were 
paired with the six negative sentences to yield six congruent negative conjoined cognitions (−6/−1; −2/−2; 
−4/−3; −5/−4; −3/−5; and −1/−6), and then with the six positive sentences to yield six incongruent, 
conjoined cognitions (−6/+1; −2/+2; −4/+3; −5/+4; −3/+5; and −1/+6). 


Table 1. Linguistic stimuli for cross-channel cognition 

Positive                                       Negative 
(1) Tom said, ‘I won the bet’.                 (1) Tom said, ‘I lost the bet’. 
(2) Jack said, ‘I passed the exam’.            (2) Jack said, ‘I failed the exam’. 
(3) Paul said, ‘I loved the food’.             (3) Paul said, ‘I hated the food’. 
(4) Ann said, ‘My proposal was accepted’.      (4) Ann said, ‘My proposal was rejected’. 
(5) John said, ‘They admired my work’.         (5) John said, ‘They criticized my work’. 
(6) Jim said, ‘Business is fine’.              (6) Jim said, ‘Business is lousy’. 
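
The construction of the 24 cross-channel slides can be sketched in code. This is a hedged illustration, not the author's procedure: the mapping from sentence number to face number is read off the pairings listed in the text, while the function name and the descriptive condition keys are assumptions. The two incongruent conditions are named by their content rather than by number.

```python
from collections import Counter

# Sentence number -> face number, read off the face/sentence pairings
# +6/+1; +2/+2; +4/+3; +5/+4; +3/+5; +1/+6 given in the text.
FACE_FOR_SENTENCE = {1: 6, 2: 2, 3: 4, 4: 5, 5: 3, 6: 1}


def build_slides():
    """Return {condition: [(face, sentence), ...]}; the sign marks positive or negative."""
    slides = {
        "congruent_positive": [],
        "positive_face_negative_sentence": [],
        "congruent_negative": [],
        "negative_face_positive_sentence": [],
    }
    for sentence, face in FACE_FOR_SENTENCE.items():
        slides["congruent_positive"].append((face, sentence))
        slides["positive_face_negative_sentence"].append((face, -sentence))
        slides["congruent_negative"].append((-face, -sentence))
        slides["negative_face_positive_sentence"].append((-face, sentence))
    return slides


SLIDES = build_slides()

# Each signed face and each signed sentence should occur in exactly two slides.
FACE_USE = Counter(face for pairs in SLIDES.values() for face, _ in pairs)
SENTENCE_USE = Counter(sentence for pairs in SLIDES.values() for _, sentence in pairs)
```

Built this way, every face and every sentence appears in exactly two slides, once in a congruent and once in an incongruent combination, giving 24 slides in all.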

Each presented facial expression was labelled with the name of the speaker of the conjoined quotation. 
Thus, a total of 24 pairs of conjoined cross-channel stimuli were obtained. Each of the facial expressions and 
each of the sentences appeared in two combinations, once congruently with another stimulus and once 
incongruently with another stimulus. Two sets of 24 slides each were made. The first set had the linguistic 
stimuli on the left and the perceptual stimuli on the right, an underlined blank (____) separating the two (this 
will be referred to as order 1). The other set had the same conjoined stimuli with the left-to-right ordering 
reversed (referred to as order 2). For each of the two orderings, four practice slides were made, one 
belonging to each of the four polarity conditions. 


Subjects 


The same types of subjects as in Expt. I, two groups of 14 each, were used, one being shown the conjoined 
stimuli in order 1 and the other being shown order 2. 


Procedure 


Subjects were run individually. The same apparatus and the same procedures as in Expt. I were used. 
Subjects were asked to imagine that each slide showed something being said and simultaneously the 
expression of the speaker. They were to respond with saying and or but to fill in the blank conjoining the 
sentence and the facial expression according to whether they thought the two sets of information were 
congruent or incongruent - that is, whether they expected someone making the quoted statement would be 
likely to have the given expression (for order 1, and conversely for order 2). Subjects were also asked to 
look at the slides from left to right — hence, for order 1, reading the quotation first and, for order 2, looking 
at the facial expression first. 

Subjects were then shown the four practice slides, followed by the 24 experimental slides in random order, 
with the constraint that no two slides that had identical linguistic or perceptual constituents were to be 
presented consecutively. The response latency for each presented slide was recorded, as well as whether the 
response was and or but. 


Results 


Eight response times which were over 5 sec (constituting 1.2 per cent of the data) were discarded 
from the analyses, and the same correction as made in Expt. I (36 msec subtracted from each actual 
but response) was performed here as well. The total number of ‘errors’ (saying but when and was 
expected, and vice versa) for order 1 was 6 for the congruent positive condition (out of 
14 × 6 = 84 responses = 7.2 per cent), 11 (13 per cent) for the congruent negative condition, 5 (6.0 
per cent) for the incongruent, condition, and 6 (7.2 per cent) for the incongruent, condition; for 
order 2, the total ‘errors’ were 11 (13.1 per cent) for the congruent positive condition, 8 (9.5 per 
cent) for the congruent negative condition, 4 (4.8 per cent) for the incongruent, condition, and 3 
(9.5 per cent) for the incongruent, condition. The mean percentage of ‘errors’ was 9 per cent for 
all the responses in this experiment (vs. 21 per cent for Expt. I). A two-way analysis of variance 
with repeated measures, with stimulus ordering and polarity condition as the main effects, 
showed no significant differences. 


Figure 3. Response latencies for cross-channel signs. 

Figure 3 shows the average response latencies across the 14 subjects in each group as a 
function of polarity condition. Each subject’s response times, including ‘errors’, for the six 
presented slides of the same polarity condition were averaged for analyses. A two-way analysis 
of variance with repeated measures was made on these averaged times, with four levels of the 
polarity factor and two levels of the stimulus ordering factor. There was a significant ordering 
effect (F = 5.03, d.f. = 1, 26, P < 0.05); as can be seen from Fig. 3, order 1 (with sentences 
presented to the left of the faces and presumably processed first) had consistently shorter 
response latencies than order 2 (with the faces presumably being processed first). For all polarity 
conditions, the difference was of the order of 330 msec. 

The main effect of polarity condition was also significant (F = 3.40, d.f. = 3, 78, P < 0.05). The 
following three orthogonal planned individual comparisons were made. (1) Comparing the 
congruent conditions (congruent positive together with congruent negative) with the incongruent 
conditions (incongruent, together with incongruent,) there was a significant difference (F = 6.63, 
d.f. = 1, 104, P < 0.025); the mean for congruent cognitions was 1.932 sec and the mean for 
incongruent cognitions was 2.007 sec. However, as can be seen from Fig. 3, this effect was due 
to the speed with which congruent positives were processed (as in Expt. I). (2) Comparing the 
two congruent conditions, congruent positives had a mean of 1.861 sec and congruent negatives 
had a mean of 2.003 sec, significantly different at the 0.001 level (F = 11.69, d.f. = 1, 104). 
(3) Comparing the two incongruent conditions, incongruent, had a mean of 1.978 sec and 
incongruent, had a mean of 2.037 sec. This difference, however, was not significant (F = 2.05, 
d.f. = 1, 104). 

As was the case in Expt. I, response latencies for the congruent negative condition and the 
two incongruent conditions were in the same general range, all being longer than those for the 
congruent positive condition. But, again, there were two affectively negative conjuncts (one 
linguistic and one perceptual) for the congruent negative condition, and for each of the 
incongruent conditions there was one negative conjunct (the facial expression for incongruent, 
and the quotation for incongruent,) together with the incongruence relation between the conjoined 
cognitions. 


Discussion 


As far as the two hypotheses are concerned the two experiments showed similar results for 
conjoined perceptual cognitions and conjoined cross-channel cognitions: affectively negative 
signs required longer processing times than affectively positive signs, and incongruent cognitions 
required longer times than congruent cognitions. In Expt. I it was assumed that subjects judged 
the congruence/incongruence of conjoined facial expressions by comparing the affective tones of 
the total facial expressions rather than by comparing the identity/non-identity of their constituent 
facial features (eyes, brows, or mouth). The latter process would reduce the task to simple 
matching of visual patterns, similar to that studied by Posner & Mitchell (1967) or Olson (1973). 
Cuceloglu (1972) showed that the mouth feature played a distinctive role for the pleasantness 
dimension. But there were indications that subjects in Expt. I could not have been simply 
comparing identity/non-identity of specific mouth or other features. 

Each pair of facial expressions in the congruent positive, congruent negative, and the 
incongruent₁ conditions had just one pair of features which were not identical, either the eyes, 
the brows, or the mouths. A simple search for match or mismatch of any possible pair of facial 
features should not result in differential response latencies for the three conditions. Assuming 
that subjects attended to the dominant mouth feature only, this still cannot explain the difference 
between the congruent positives and the congruent negatives; they both had identical mouths in 
their respective pairs of facial expressions. 

Some subjects reported using labels for the presented facial expressions (e.g. satisfied, angry), 
both in Expt. I and Expt. II, also indicating that they perceived and responded to total 
expressions rather than constituent facial features. It was less likely that subjects excluded other 
perceptual cues and attended to just specific features, given the frequency of perceptual closure 
of the visual form of the face as a whole - we see whole faces more often than isolated facial 
features. The role of these features in facial expressions is analogous to that of morphemes 
within multi-morpheme words. Although these morphemes are high-frequency fragments, occurring 
in many words (e.g. fide in confide, fidelity, confidential, etc.), they do not have unique 
meanings as do words in which they occur. Osgood & Hoosain (1974) showed that such 
non-word morphemes have higher recognition thresholds than the words in which they occur or 
shape-matched words, despite much higher usage-frequencies of the former. In other words, 
bound elements have lower perceptual salience than unbound wholes. 

The fact that similar results were obtained for perceptual as well as cross-channel conjoined 
signs strongly suggests that perceptual and linguistic cognitions share the same semantic system, 
interacting freely. However, results in the experiments did suggest some differences between 
perceptual and linguistic cognizing. More ‘errors’ were found in Expt. I, where both conjoined 
cognitions were perceptual, than in Expt. II, where one of the conjoined cognitions was 
linguistic. It seems that linguistic signs are less ambiguous than perceptual signs, and the 
former’s presence in Expt. II helped reduce ‘errors’. This can be compared with the finding of 
low reliability in labelling but high reliability in affect-scaling for facial expressions (Osgood, 
1966). 

In Expt. II the linguistic-perceptual ordering had consistently faster processing times than the 
perceptual-linguistic ordering, although identical constituent stimuli were used. With order 1, the 
prior processing of the linguistic materials created a more specific set for the subsequent 
cognizing of the more ambiguous outline faces, thus facilitating comparison of the two cognitions 
and speeding the eventual conjoining (and/but) response. On the other hand, with order 2, the 
prior perceptual forms could only arouse a general affective state to be compared with the 
subsequent linguistic material. In information theory terms, outline faces reduce semantic 
uncertainty less than sentences, and hence interpretation of the facial expressions is more 
constrained by the quotations than vice versa. 

To conclude, there is evidence supporting the two hypotheses presented here, predicting 
increased processing times for negative signs compared to positive signs as well as for 
incongruent signs compared to congruent signs. These hypotheses apply to linguistic signs (cf. 
Hoosain, 1973, 1974), perceptual signs, and conjoined linguistic-perceptual or 
perceptual-linguistic signs. The two hypotheses of cognitive dynamics are derived from the 
neo-behaviouristic theory of meaning. The negative signs and the incongruent signs are 
hypothesized to be related to behaviours towards negative and incongruent stimulus situations 
respectively. 


Thinking with restricted language. A personal construct investigation of 
pre-lingually profoundly deaf apprentices 


Angus Gordon 


By means of the deaf manual system of communication, Personal Construct grids were elicited from 
profoundly deaf apprentices. Contrary to expectation, the grids are not uni-dimensional, and they are as well 
elaborated as grids with the same supplied elements and constructs elicited from hearing subjects. 
Implications of the deaf subjects’ grids are considered in the light of the relationship between thinking and 
language explored by Furth. 


In his book Thinking Without Language Hans Furth (Furth, 1966) investigated the thought 
processes of pre-lingually profoundly deaf children. He explored the relationship between their 
linguistic handicap and their apparent ability to think logically despite this. His work not only 
adds to the understanding of the psychology of deafness, but has wider implications for 
psycholinguistics and cognitive psychology. The criticism has been made of this book that it 
would be more appropriately titled ‘Thinking with Restricted Language’ in that most deaf 
children are not entirely without language. Deaf children of deaf parents, about 10 per cent of 
the deaf school population, pick up language naturally. Just as normal children imitate the 
sounds their parents make, so do these children imitate the signs their deaf parents make with 
their hands. At schools for the deaf, despite often rigid repression of this natural language of the 
deaf, other children pick up manual communication from these signing children. 

In a more recent book (Furth, 1973) Furth conceded that the transatlantic equivalent of 
British deaf manual signing, American Sign Language, can be regarded as a language. Conrad 
(1970) and Bellugi, Klima & Siple (1974) have demonstrated that at least one aspect of thinking, 
i.e. short-term memorizing, is carried out by profoundly deaf people in sign rather than acoustic 
coding. Bellugi & Fischer (1972) have also demonstrated that American Sign Language is as 
efficient as spoken English in its rate of conveying information, despite the fact that individual 
words take longer to sign. However, although Furth’s original contention, that deaf children 
demonstrate ‘thinking without language’, is no longer tenable, it still remains true that their 
language is severely restricted. The main restriction on the manual communication of the deaf is 
the relatively small number of available signs. Signs are discrete movements or configurations of 
the hands which correspond to individual words. These are not to be confused with finger 
spelling, where configurations of fingers correspond to the letters of the alphabet. The potential 
vocabulary of the finger speller is of course unlimited. But in addition to being laboriously slow, 
finger spelling presupposes a degree of literacy which limits its use in the manual communication 
of most deaf people. This study is an attempt to see how the poverty of the available vocabulary 
of signs affects some aspects of thinking. 

One of the more restricted aspects of deaf manual communication is the paucity of available 
personal adjectives. It is easy to demonstrate such concrete concepts as ‘dog’, ‘man’ and 
‘yellow’ by pointing to various objects comprising each of these classes and then making the 
discrete signs corresponding to these words. But such abstract personal qualities as ‘noble’, 
‘suspicious’ and ‘extravert’ are much more difficult to demonstrate. There is a generally held 
view of the pre-lingually deaf that their linguistic limitations force them to see people in their 
environment in terms of ‘black’ and ‘white’ or ‘good’ and ‘bad’. Translated into the language 
of Personal Construct Theory, this would imply that they have a relatively unelaborated 
unidimensional construct system. 


Method 


Eight male deaf apprentices in their late teens attending a trade training establishment were interviewed. 
They were all profoundly deaf, and had been so before the acquisition of language. Having left school for at 
least one year, they were all quite proficient in manual communication. All instructions were given to them 
through this medium. 

Eight cards with cartoon drawings were used as elements in a grid. These were: 1. You; 2. Dad; 3. Mum; 
4. Teacher; 5. Girl you like; 6. Girl you don't like; 7. Boy you like; 8. Boy you don't like. 

They were asked to rank these elements along ten supplied constructs. The constructs which are 
represented by single word adjectives comprise a sizeable proportion of the personal adjectives available in 
the deaf manual vocabulary of signs. Of course, some deaf persons can communicate much more complex 
adjectives through the medium of finger spelling, but many have not the necessary literacy to do this. The 
other constructs were descriptive phrases. All constructs were communicated by sign. They were: 


1. like - dislike 
2. lies all the time - speaks the truth 
3. happy - sad 
4. angry all the time - never angry 
5. has many friends - has few friends only 
6. jealous - never jealous 
7. works hard - lazy 
8. worries all the time - never worries 
9. clever - stupid 
10. drunk again and again - never drunk 


The same elements and constructs were supplied to eight male nurses in their late teens. As pupil nurses 
they could be judged to have non-verbal intelligence comparable to that of the deaf apprentices but, of 
course, considerably superior linguistic skills. 


Results 


The grids were analysed by the MRC Grid Analysing Service (Slater). Consensus grids were 
computed for the deaf group and the hearing control group. The two consensus grids were 
compared on the Delta programme. 

In most grids, the first principal component is usually taken to be the evaluative dimension. 
The hypothesis tested here is that the constructs of the deaf group would cluster round this 
component. But the percentage variance accounted for by component 1 was for the two groups: 
Deaf, 54.82 per cent; Hearing, 60.34 per cent. Thus on this crude measure the construct system 
of the deaf group is slightly more elaborated than that of the hearing group, not less as the 
hypothesis predicted. An alternative explanation of this result might be that the consensus grid of the deaf 
group is less consistent, i.e. that a large proportion of the variance not covered by component 1 
was random error. But comparison of the two groups on this component shows that the deaf 
group’s consensus grid is comparable in its structure to the hearing group’s grid (see Table 1), 
and indeed the general degree of correlation between them is 0.8585. Comparison of the 
element loadings on this evaluative component raises some interesting speculations about the 
psychology of deafness. For the hearing group, loading on ‘Dad’ and ‘Mum’ are almost equal, 
but for the deaf group ‘Mum’ is rated considerably lower than ‘Dad’ in this component 
(negative loadings represent high ratings). Does this represent an overidentification with the male 
parent figure in deaf youths? Is the lower evaluation of ‘teacher’ by the deaf group a commentary 
on an education system in which most lessons are still delivered purely through the medium of 
speech? 


Discussion 
This paper does not, of course, indicate that the construct system of the deaf is as elaborate as 
the construct system of hearing people. The poverty of their language makes this unlikely. But 
within the limits set by their restricted vocabulary, it does appear that their view of people in 
their immediate orbit is more complex and sophisticated than might be thought. It might also 
confirm the findings of Bellugi & Fischer (1972) that manual communication is not as primitive a 
system as its antagonists maintain. 


Table 1. Construct and element loading on component 1 

                                                Deaf       Hearing 
                                                Loading    Loading 
Construct 
1. like - dislike                               −0.7032    −0.8596 
2. lies all the time - speaks truth              0.9665     0.8206 
3. happy - sad                                  −0.7215    −0.9263 
4. angry all the time - never angry              0.5420     0.5444 
5. has many friends - has few friends only      −0.0563    −0.8838 
6. jealous - never jealous                       0.8938     0.4786 
7. works hard - lazy                            −0.8998    −0.8973 
8. worries all the time - never worries         −0.9384    −0.7640 
9. clever - stupid                              −0.8606    −0.6697 
10. drunk again and again - never drunk         −0.0800     0.7873 
Element 
1. You                                          −1.4746    −0.4472 
2. Dad                                          −1.0142    −0.9993 
3. Mum                                          −0.3121    −1.0103 
4. Teacher                                       0.0314    −0.6408 
5. Girl you like                                 0.5106     0.0940 
6. Girl you don’t like                           0.5473     0.4971 
7. Boy you like                                  0.5766     1.3122 
8. Boy you don’t like                            1.1349     1.1943 

This group of deaf apprentices probably represents a better adjusted sample of the deaf 
population than many because of competition for places on the training schemes. Perhaps on a 
more carefully selected sample the findings of this paper might be reversed. But as Furth said, it 
is not very interesting to demonstrate significant but minute differences between the deaf and the 
hearing populations. What is more remarkable is the degree of overlap, which is paralleled by the 
number of normal well-adjusted deaf adults who are useful members of the community. Furth 
showed that, despite their linguistic handicap, deaf children were able to cope with problems in 
formal logic, presented in symbol form. This paper suggests that deaf teenagers can also cope 
with and organize in a sophisticated fashion the few linguistic concepts at their disposal which 
refer to the personal qualities they see in people. This further adds to our understanding of the 
relationship between ‘thinking’ and ‘language’ explored by Furth. 


Special review feature: Books on teaching and learning 
in higher education 


Teaching and Learning in Higher Education, 3rd ed. By R. M. Beard. Harmondsworth: Penguin. 1976. 
Pp. 251. £1.00. 

Teaching Students. By D. Bligh, G. J. Ebrahim, D. Jacques & D. W. Piper. Exeter University Teaching 
Services, University of Exeter, 1975. Pp. 286. £2.80. 

How Students Learn. Edited by N. Entwistle and D. Hounsell. Institute of Research and Development in 
Post-Compulsory Education, University of Lancaster. 1975. Pp. 199. £2.95. 

Teaching and Learning in Higher Education. By A. Heim. London: National Foundation for Educational 
Research. 1976. Pp. 134. £3.50. 

The Immortal Profession. By G. Highet. New York: Weybright & Talley. 1976. Pp. 223. $10.00. 

Teaching in the Universities: No One Way. Edited by E. F. Sheffield. Montreal: McGill University Press. 
1974. $5.00. 


The last ten years or so have seen a rapid growth in introductory courses for new members of staff and in the 
appointment of staff directly concerned with the improvement of teaching in higher education. This upsurge 
of interest is currently very strong in the United States: at the annual meeting of the American Psychological 
Association held in Washington in 1976, for example, seven symposia were held and over fifty papers 
delivered on the teaching of psychology. In this country too there is an increasing interest in research on 
teaching and learning in higher education and this is reflected in four of the books reviewed below. 

Teaching Students is published by the University of Exeter’s Teaching Services Department, although in 
fact it was probably conceived and written when the authors were with the Teaching Methods Unit at London 
University. The book consists of seven chapters, five of which were written by Donald Bligh (on needs, 
assessment, the selection of students, teaching methods, course defects), one by David Jacques (on 
sequencing courses) and one by David Warren Piper (on course design). The sequence of the chapters seems 
somewhat arbitrary, and the ones by Jacques & Piper are more theoretical and less practical than the ones 
by Bligh. 

Chapter 1 (‘What needs should the methods satisfy?’) introduces the notions of aims and objectives in 
education, and expresses the view that the aim of higher education should be to develop in students the arts 
of thinking and of tolerance. The approach seems simple and problems are not discussed in depth. The 
seminal paper on instructional objectives by Macdonald-Ross (1973) is not referred to. Chapter 2 
(‘Assessment of students’) is a good deal better. Reasons for assessment, difficulties of doing it, and the 
advantages and disadvantages of different methods are discussed. Chapter 3 (‘Selection methods and 
students’ circumstances’) might at first seem less relevant to the teaching of students but a good case is made 
for its inclusion. The core of the book, however, lies in chapter 5 (‘Teaching methods’) which occupies over 
one-third of the text. Here Bligh has gathered together an enormous number of researches and briefly 
summarized them - topics include the lecture, discussion methods, practicals, projects, programmed learning, 
individualized instruction, simulations, reading, independent study, and mechanical presentation by radio, 
tape, television and computer. The book concludes (chapter 7) with a short discussion on the evaluation of 
teachers (particularly by students). There are difficulties in presenting such summaries of research and I shall 
discuss some of these shortly, but for the moment Bligh and his colleagues are to be congratulated on 
making so much diverse material readily accessible. 

Ruth Beard’s Teaching and Learning in Higher Education has now been republished in an enlarged and 
revised (3rd) edition. This text, as before, is divided into three introductory chapters (on planning, objectives 
and the psychology of learning), five chapters on methods (lecturing, small groups, practical and laboratory 
teaching, new techniques and independent study) and a concluding chapter on evaluation. It would appear 
therefore that there is considerable overlap with Teaching Students, but Beard’s bibliography contains about 
350 references whereas Bligh’s contains over 700, and there are fewer than 100 references common to both 
texts. 

Undoubtedly a problem for the authors of both books is how to select from the literature available which 
bits to use and which to discard, and how to be comprehensive without boring the reader. In both texts 
failures in this respect often lead to one-paragraph descriptions of research results, or even occasionally 
one-sentence ones, which can convey little but frustration. Beard, for example, writes at the end of a 
paragraph on student evaluation, ‘A new and objective approach to evaluating teaching has been attempted 
in a department of economics in Canada (Huber, 1974).’ Such a sentence is unhelpful, for few readers (the 
reviewer included) will have the energy to write to a Canadian department for an unpublished paper. In her 
section on the psychology of learning, to take another example, Beard is clearly faced with limitations of 
space. Here again one-sentence descriptions (e.g. of Maslow) might be sufficient for psychologists familiar 
with the background, but surely not for non-psychologist readers? The choice of materials, too, offers food 
for thought: paragraphs on theories of learning, Gagné's varieties of learning, selective perception and 
individual differences are all provided, but the brevity of it all, and the loose relationship to higher education 
which the reader is often left to infer, make its value difficult to assess. 

A closely related problem is that of the selection of data with which to argue a case. Some experiments in 
this field are of dubious validity whereas some are more acceptable. However, if an author simply quotes a 
reference and appears to give an equal weighting to different findings, then the reader has no way of telling 
(without actually searching out and reading the source material) whether or not the experiments cited are 
good or bad. An author, therefore, has a certain responsibility in a text of this kind. In both the Bligh and 
Beard texts I am not fully convinced that this responsibility has always been adequately met. There is, for 
example, what I consider to be the perpetuation of myths in both textbooks. I am somewhat sceptical, for 
example, of Bligh’s well-known conclusions on the benefit, or otherwise, of lecturing. (These are that 
lectures convey information successfully, but do not help students to think, or to change attitudes.) Bligh 
explains he has examined over 100 studies to reach these conclusions (but the references are provided 
elsewhere: Bligh, 1972). Bligh does not, however, refer to Costin’s (1972) paper which makes a similar 
comparison. Costin examined more critically the confounding variables that arise in this area of research and 
showed more clearly the tendentiousness of trying to arrive at such over-simple conclusions. Other examples 
of myths being perpetuated are the weight given by both authors to Gilbert’s ideas about backward chaining 
for the sequencing of instruction (which have not stood up to experimental tests in meaningful learning 
situations) and to Freyberg’s study purporting to show that on immediate tests students not taking notes did 
as well as did students taking brief notes, full notes, or receiving a duplicated summary of the main parts of 
the lecture, but that after a period of eight weeks students with the summary did best of all. Inspection of 
Freyberg's results would in fact show the differences between all four groups to be very small indeed - and 
virtually no difference at all between the handout and the notetaking conditions. (Incidentally, Beard’s 
reference to the Freyberg study has remained incorrect throughout her three editions.) 

A final myth I wish to discuss here is that perpetuated by Beard’s sentence, ‘The reader who is not 
familiar with programmed learning may like to know that there are two kinds of programme: the linear 
programme devised by Skinner (1958) and the branching programme originally advocated by Crowder.’ This 
picture of programmed learning is a good ten years out of date and, judging by her later discussion, Beard 
should know this. This myth is further strengthened by a one-paragraph discussion on computer-assisted 
learning, which, as far as Beard is concerned, seems to be a good way of presenting branching programmes and 
nothing else. Her section on algorithms is weak, and the Keller method is only mentioned in one sentence. 
Bligh’s text is much to be preferred on these points. However, comparing these two books together, and 
particularly when thinking of them in terms of their suitability for new members of staff, Beard’s text is 
probably the more suitable. This is because Beard seems to be more in charge of the material, marshalling it 
to her own ends and arguments, and by covering both teaching and learning she has a wider scope. Bligh’s 
text, with its different chapters written by different authors and its strange sequencing, lacks this cohesion 
and at times reads like an extended catalogue. 

Teaching in the Universities is quite a different sort of book from that by Bligh or Beard: it is not about 
research evidence, it is about what people think. The author, Edward Sheffield, hit upon the intriguing notion 
of asking about 7000 graduates from 19 Canadian universities to name the teachers they had experienced as 
students whom they considered excellent and to say what it was about these teachers which had made them 
so effective. Approximately 1000 students replied, and 23 ‘star figures’ emerged. Each of these was then 
invited to contribute a chapter to the book explaining how he taught, and finally the authors’ chapters are 
juxtaposed with a selection from the students’ comments. The book concludes with an analysis of 
commonalities and differences between the teachers. The points to emerge most often are that good teachers 
are perceived to be enthusiastic masters of their subject matter, ready to give well-prepared and orderly 
lectures which demand considerable preparation. What is abundantly clear from the different chapters, 
however, is that the teachers differ greatly in their personalities and methods and that there is no one way to 
be an effective teacher. Sheffield concludes that good teaching is a matter of attitude — a liking for the 
subject, an enthusiasm for communicating it and a respect for the students. 

Sheffield’s text presents the views of 23 eminent teachers — each expressed in separate chapters. Gilbert 
Highet's The Immortal Profession is the viewpoint of one eminent scholar expressed in a full length text. Of 
particular interest are chapters on ‘the scholarly life’, ‘communication’ and the ‘need for renewal’. The 
chapter ‘teaching college teachers how to teach’ does not discuss research findings but simply opens out the 
discussion on what new teachers should be taught, and how, when and from whom they should gain this 
information. New teachers, Highet argues, need to be taught how to communicate clearly, how to prepare 
for class, how to set examinations, how to be sympathetic to students and so forth: but more than this they 
need to recognize that active energetic teaching is good for the soul. These things, he argues, cannot be 
learned in short courses on teaching methods: one learns best by watching other (master) teachers, and by the 
constant monitoring of one’s own progress. The Immortal Profession is a graciously written text which 
defends traditional values in university teaching. It is a non-technical, non-research oriented study which 
would be particularly appropriate for humanities teachers. 

If Gilbert Highet’s text has a slightly transatlantic touch about it, Alice Heim’s Teaching and Learning in 
Higher Education has a definite Oxbridge flavour. The aim of this text is primarily to consider the pros and 
cons of lecturing. In addition there are chapters on tutorials, seminars, assessment and so forth. Topics raised 
in this text which are not discussed in the ones described above are lecturing to disabled students, training in 
graduate research and the role of internal and external examiners in this connection. Alice Heim aims to 
show that lectures can challenge students to think, and she is particularly in favour of encouraging 
participation and dialogue in lecture-type situations. 

The method used to achieve these objectives is that of forceful argument, supported by appropriate 
anecdotes or quotations provided by students and colleagues, with very little reference to any research 
findings. Indeed, in a chapter entitled ‘Why yet another book on University Teaching?’ Dr Heim writes ‘the 
bibliography on pages 133 and 134 (which contains twenty-two references) is a list selected from a 
proliferation of books and reports on the subject of university teaching - some of which I have not seen and 
few of which I have read from cover to cover’. The little research that is discussed in this particular chapter 
seems well selected to illustrate Heim’s main points and her conclusions: ‘my impression then is that much 
of the writing on university teaching is inconclusive: it is sometimes painfully obvious, yet sometimes lacking 
in clarity; often it is pedantic and prolix, inappropriate in its form, and its data are derived from sources 
whose pertinence is marginal if not non-existent’. While this may be true (and let us face it, it is true of 
much of the research) the implication that a psychologist writing a book on a topic need not bother to read 
the available research literature seems to me to be somewhat startling. Agreed there is probably much tacit 
knowledge amongst university teachers that can be intelligently spelled out, but I would have thought that 
the aim of a psychologist in the field would be to help provide evidence that would support or reject some of 
the value judgements at present held in the world of higher education. . . Be that as it may, one is left with 
a forceful, provocative and often witty book that is essentially practical in its advice to university teachers. 
It is to be regretted that for its size it is expensive, that it has the same title as Ruth Beard’s text on the 
same topic, and that it is not as well written as The Immortal Profession. 

How Students Learn is a set of papers which spans the ‘dimension running from a mechanistic controlled 
view of learning to freer, humanistic ideas about the importance of personal development’. The editors have 
been careful to choose readable examples from eminent psychologists and, by selective editing, they have 
made the papers more comprehensible to a wider audience. Thus, amongst others, there are papers by Gagné 
and Skinner; Broadbent, Lindsey and Norman; Ausubel and Bruner; Rogers and Maslow. There is an 
exposition of the work of Gordon Pask (by John Daniel), a piece on subjective experiences in schoolboys 
(by Michael Paffard), and a particularly interesting paper by Marton on how students read instructional text. 
An overview paper by the editors entitled ‘How students learn: implications for teaching in higher 
education’ concludes the book. 

This text is an excellent source for contrasting viewpoints on how students learn, but it is disappointing, 
however, in one respect. Somehow I do not get from the book the flavour of what I think it must be like to 
be a student learning. The paper by Marton provides a notable exception, and one by Perry includes a short 
quotation or two from students. Otherwise there is nothing to indicate that students speak to each other, that 
they move in a social world, and that they adopt a variety of strategies to deal with the complex situations in 
which they find themselves. Strangely enough Highet’s The Immortal Profession presents a far more 
sympathetic perception of the problems facing students learning. This omission may perhaps be due to the 
myopic nature of much of the research conducted by psychologists in this area, but there are instances 
where such studies have been done, and it would have been nice to have extracts from such examples 
included. An excerpt from Parlett & Millers’ (1974) Up to the Mark might have been appropriate, for 
instance. 

Research on teaching is not easy to do, and the findings are easily rejected (‘my situation is not like that’). 
Yet such research presents a strong challenge to psychologists, particularly academic ones who spend much 
of their time teaching. Perhaps much of the reluctance to carry out such research stems from a dim but 
unpleasant awareness that what little we know about how learning takes place lies in direct contradiction to 
how current instructional settings force us to go about teaching. Books such as the ones reviewed, however, 
and courses such as the ones mentioned in the introduction, will I hope make academic psychologists more 
aware of the rich field for research in which many - like three proverbial brass monkeys — presently sit. 
JAMES HARTLEY 


Book reviews 


Early Experience: Myth and evidence. By Ann M. Clarke & A. D. B. Clarke. London: Open Books. 1976. 
Pp. xiv+314. £3.95. 


The aim of this book is to present a critical study of the widely held view that ‘the environment in the early 
years exerts a disproportionate and irreversible effect on a rapidly developing organism, compared with the 
potential for later environmental influences’. The authors have assembled several eminent workers in the 
field of child development who contribute ten chapters describing relevant research, which are linked 
together by five overview chapters written by the authors. 

The introductory chapter by the authors is in many ways the most interesting and important one. It 
outlines the historical background of Western thinking from Plato to Bowlby, which has consistently 
emphasized the crucial importance of the early years. It examines carefully the assumptions which lie behind 
this conventional view, and discusses the flaws in most of the evidence and arguments which are advanced 
to support it. For example, the authors point out the fallacious nature of the arguments which automatically 
leap from correlation to causation, and particularly criticize the naive view that the existence of high 
correlations between early and later behaviour ‘proves’ the overriding importance of the early years. 

The authors indicate the harmful effects such thinking has had on social policy, such as the tendency to 
‘label’ children as a result of early adverse experiences. It is a pity, however, that the authors have not gone 
deeper into this area. There are many subtle ways in which our thinking is conditioned by cultural myths to 
the extent that the need for critical analysis may often not be apparent. It would be interesting, for example, 
to go through the social policy debates of the last 20 years to see where the ‘early years’ assumptions have 
been unquestioningly accepted by all concerned. 

The two chapters by Davis and Koluchova which follow the introduction discuss cases of children who 
had suffered extreme isolation resulting in severe physical, emotional and intellectual deprivation, but who 
were subsequently rehabilitated and achieved normal levels of development. These cases are used by the 
Clarkes to argue that there is no ‘critical period’ after which recovery is impossible. Although in one sense 
such cases do disprove the straightforward critical period hypothesis, it is still quite possible to hold a 
modified form of the hypothesis which incorporates conditions for a critical period to exist, conditions which 
themselves may be partly determined by early experiences. It is disappointing that the Clarkes limit 
themselves to criticizing the simple hypothesis without going on to discuss, even briefly, how available 
evidence might suggest profitable modifications. 

The next section has contributions by Kagan, Dennis, Tizard, Rees, Rutter and Kadushin and deals with 
‘natural experiments’, such as studies of groups of illegitimate children who have subsequently been 
adopted. It differs from the first section in that the studies are based on adequate samples and the abnormal 
environments passed through by the children are not nearly so extreme. It is the longest section, which 
seems to be a reflexion of where most of the research has been done. As the authors rightly point out, 
however, it is very difficult to infer causal relationships from correlations. Nevertheless, much of the 
evidence seems to be optimistic in the sense that it shows that what might have been thought to have been 
permanently damaging effects of particular early environments, such as institutional care, are not borne out 
by the evidence. 

The final section discusses controlled experiments of the effects of different early experiences. The 
accumulated results of these, including the large-scale American intervention projects started in the 1960s, are 
set out clearly. The chapter by Bronfenbrenner attempts to draw lessons from the American experience and 
suggests approaches for the future, and among other things emphasizes the need to continue intervention 
beyond the early years. 

Throughout the book the authors’ critical commentaries demonstrate the inadequacies of the conventional 
wisdom. Sometimes, however, they do seem to get a little too carried away by their enthusiasm into rather 
indefensible positions. For example, they present the evidence by Douglas that long or repeated admissions 
to hospital before the age of five are associated with behaviour disturbance and poor reading during 
adolescence; a finding also supported, at least in part, by evidence from a study by Rutter. The authors 
discuss a whole range of possible explanations for this finding which are alternatives to a causal link, the 
implication being that the existence of a causal relationship over this time span would destroy their 
arguments. Yet their general thesis does not deny that some early effects may be related to some later 
events, merely that the links are not as rigid as it has been customary to assume. 

Although it was not one of the authors’ aims, it is disappointing to find little discussion of alternative 
models or frameworks for studying influences on development. The authors tentatively advance their own 
‘wedge’ hypothesis which states that the potential for change in response to the environment decreases 
steadily, and without distinct critical periods, as children get older. While this seems reasonable, it isn’t 
specific enough to suggest how a more substantial theoretical structure might be developed. Another lack is 
that of an adequate discussion of physical development. It would have been interesting, for example, to have 
presented the evidence on catch-up growth, some of which relates closely to the idea of critical periods. 

Nevertheless, this is an important book which deserves to be read carefully and taken seriously by all 
those who are concerned with theoretical problems and practical policies relevant to child development. 
HARVEY GOLDSTEIN 


Professional Approaches with Parents of Handicapped Children. Edited by Elizabeth J. Webster with a 
Foreword by Alan J. Weston. Springfield, Il.: Charles C. Thomas. 1976. Pp. 268+xxiii. $10.95. 


The best that can be said of this book is that it is a curate's egg; the worst that it is a rather syrupy pudding, 
with a few good plums to be pulled out. From the title we hope for practicalities, but though some are to be 
found they are too often buried in a mass of fuzzies; often grandiose ones. 

Kushlick (1975) has drawn attention to the relevance to the services for the handicapped of Mager's (1972) 
discrimination between fuzzies and performance statements; a fuzzy being a proposition which fails the ‘Hey 
Dad’ test, i.e.: 


Hey Dad, let me show you how I can 
develop my full potential 
develop a mature approach 
develop self awareness 
increase my sensitivity to the environment 
etc., etc. 


compared with performance statements which pass, such as: 


Hey Dad, let me show you how I can 
smile a lot 
say favourable things about others 
recognize symptoms 
assemble components skillfully 
etc. 


There is a good deal in this book which will not pass the ‘Hey Dad’ test; there is also, except in two 
specific instances, a tendency to regard the parent as a patient, with a disease of which the child is a 
symptom. Sometimes this is coupled with a stifling sentimentality which would be instantly rejected, in my 
experience, by the parents to whom it might be directed. 

The Foreword, by Professor A. J. Weston, is somewhat forbidding, and calls into question the actions of 
many who work with parents. Weston believes that professionals who ask parents to function as clinicians or 
educators are shirking their responsibilities. By doing so, we are asking, apparently, for ‘a disgruntled 
parent’ who is being presented with an ‘impossible task’. Further (in the US presumably), he suggests that 
parents may begin to question why services should be paid for when they can do them themselves. 

Behind the words seems to be a kind of professionalism which is as much concerned with protecting itself as 
with giving help to parents. 

The first chapter, by A. T. Murphy, is a positive haystack of fuzzies; phrases abound like ‘At the root of 
mature living is the creativity which faces insecurity as the growing point in life’ or ‘The basic urge to fuller 
humanness as a universal life force’. Dr Murphy seems actually to prefer fuzzies: one section is headed 
‘Counselling goals as guides rather than fixed points’, and his adoption of the Maslovian phrase ‘that which 
one loves one is prepared to leave alone’ seems to beg the whole question of professional/parent interaction; 
‘loving perception’ to this field worker means anything but being prepared to leave alone. 

The contribution of B. J. McWilliams is humane and contains many good points, but throughout the feeling 
remains that clinicians expect parents to have poor unproductive relationships with their handicapped 
children and that therefore the task is to modify these relationships. The implication is that parents need to 
prove their acceptance of the child, otherwise the clinician will assume rejection. This is pre-Olshansky 
thinking (see Wolfensberger, 1968) in which the parent is the chronic sick who ‘hopefully. . . will seek 
assistance as it is required as much for their son or daughter as for themselves’ (my italics). McWilliams’ 
orientation is made clear in the sentence ‘The child’s major problem and the management it requires are 
often intimate parts of the need for counselling and relate directly to the content of parent work’. Not 
‘often’, always. 

A. Simmons-Martin describes an approach with hearing impaired children based on a simulated home 
situation in which parents are taught to make use of the normal appurtenances of the home in teaching their 
children. This has the advantage over home visits of saving the clinician’s time, but if the surrogate home is 
an adequate model for all, there must be considerable class-limitation among clients. 

Chapter V, in which G. L. Wyatt reviews different methods with parents and siblings, does not add a lot to 
the existing literature on behaviour modification or other therapeutic approaches. Some methods described 
fill the staid British mind with horror; for example Speck & Attneave’s (1973) idea of a ‘supportive tribe’ 
consisting of up to 50 people, parents, siblings, ‘schoolmates, neighbours, relatives, storekeepers, policemen, 
teachers, principals and others. . .this large network meets for several evening sessions given to intense 
discussion, interaction and psychodynamic exploration of the needs of the child and his family . . .the group 
goes through various stages of mutual interaction which at times may become quite emotionally charged’. 
One can well imagine it. This kind of approach, however well intentioned, reinforces the families’ idea of 
themselves as peculiar, different, sick or a burden on society. 

Similarly, F. C. Taylor’s description of ‘Project Cope’ displays the same overintense concern, and total 
absence of a sense of humour. A single extract gives the flavour: ‘The leader [in a group session for 
mothers]. . .spends a few moments extolling mothers who have handicapped children. She emphasizes that 
once there is a handicapped child in the family, it is important for the mother to garner her strength to find 
needed medical and community resources and to keep in mind that as diamonds are made by pressure so can 
human beings become strong under pressure. Then a poem such as “Heaven’s special child” is read and the 
leader praises the mothers for their efforts to help their children find their place in the community.’ 

The chapter by M. Todd & M. I. Gottlieb on ‘Interdisciplinary counselling’ and that by V. T. Falck on 
‘Future programmes’ are rather pedestrian, similarly counsellor oriented, but contain some good 
performance statements among the verbiage. Still, however, we retain the impression of the parent as 
patient. 

Chapter III, dealing with approaches to parents of autistic children, by N. L. Doernberg, M. B. Bernard & 
C. F. Lenz, is one of the few plums in the book; it combines practicality with humility. The remarks about 
the dangers of the myth of professional omniscience and omnipotence are most apposite. The professional 
who attempts to present him or herself as a gentle angel of light and mercy who is never angry, frustrated or 
at a loss for ideas is doing the client a disservice. 

Only this and chapter VIII, by N. E. Bissell, rescue the book from the slough of sentiment. Bissell’s 
chapter is to be recommended to all the other contributors, and to all field workers. He is sensible, 
humorous and self-critical. He rejects the ‘pathological profile’ of the parent of the exceptional child, and 
his comment that ‘many professionals have never really meant or understood the implications of full team 
membership for parents. The team they conceptualized was captained or coached by the professional’ brings 
into focus the opposed attitudes of most of the other contributors. Apart from these two chapters, the book 
is disappointing: verbose and backward-looking. Workers on this side of the Atlantic (and their clients) will 
find its sentimentality counterproductive. 

SARAH SANDOW 


Patterns of Psychological Thought: Readings in Historical and Contemporary Texts. By James R. Averill. New 
York: Halsted Press (John Wiley). 1976. Pp. xii+603. £16.00. 


Two-thirds of this book consists of readings, one-third of introductory essays, and a pardonable first reaction 
might well be ‘Another anthology! Haven’t we had enough of anthologies?’ 

Such a reaction would be a pity, because this is an anthology with a difference, and, in fact, Dr Averill has 
produced a most stimulating and original book. He takes ten major figures in the history of psychological 
thought (Plato, Aristotle, Plotinus, Augustine, Aquinas, Descartes, Hume, Kant, Darwin and Marx); gives 
sizeable extracts from the writings of each of them; and pairs each classic writer with a contemporary, from 
whose works there are also longish extracts. ‘The purpose’, he says, ‘is not to demonstrate how the historical 
figure anticipated the modern counterpart, but rather to show how both have something to say about an issue 
of contemporary concern.’ The marriages between ancient and modern may seem somewhat ill-matched 
(Plato and Lawrence Kohlberg, Aristotle and Charles Taylor, Plotinus and Charles Tart!), but surprisingly 
they work. The passages have been exceedingly well chosen, and combined with brief, but excellent, 
introductions, they really illuminate both the old and the new, and highlight the different ‘patterns’ of 
psychological thought, or approaches to psychological problems (rational, empirical, mystical-formal, 
mechanistic, organismic, etc.) which Dr Averill sets out schematically in the introductory essay, 
supplementing it with an essay by Thomas S. Kuhn on the function of dogma in scientific research. 

Perhaps the only ill-matched pair is that of Hume and Gordon Allport. Had these two ever met, they 
would certainly never have seen eye to eye, and in fact the extracts here illustrate contrasting, not 
concordant, viewpoints. But without exception the other pairings are successful. One can imagine Plato 
approving Kohlberg’s ‘cultural universals’, and St Thomas blessing Magda Arnold. And Lorenz, with his 
support for the doctrine of the a priori in the light of contemporary biology, would no doubt have pleased, 
even if he did not entirely satisfy, the transcendental Kant. The inclusion of Marx is also a valuable one. 
Many psychologists, even today, are innocent of any first-hand acquaintance with the writings of Karl Marx, 
and what Marx has to say on alienation has contemporary relevance, quite apart from the significance of his 
general standpoint. The matching passage from Luria, with an account of his researches in Soviet Central 
Asia, is a most interesting illustration of the effects of cultural change on psychological development. 

Obviously a good many more classic writers might have been included, and the book is already a big one. 
But was it impossible to squeeze in something from Locke, Leibniz, Spinoza and Berkeley, and to find 
modern representatives of their points of view? However, it is ungracious to complain when Dr Averill has 
given us so much. ‘If more psychologists’, he writes, ‘took the time and effort to examine historical sources, 
perhaps psychology as a science would be subject to fewer “discoveries” and greater progress.’ How true! I 
have no hesitation in saying that any student who worked through this book would have a far firmer grasp of 
the major patterns of psychological thought, and would be far less philosophically and historically naive in 
his approach to psychology. 

L. S. HEARNSHAW 


Philosophical Dimensions of Parapsychology. Edited by James M. O. Wheatley and Hoyt L. Edge. Springfield, 
Ill.: Charles C. Thomas. 1976. Pp. xxix+483. $24.50. 


This anthology contains 27 contributions, two-thirds by philosophers, the remainder predominantly by 
psychologists and parapsychologists. The book is divided into five sections — Psi and Philosophy, Cognition, 
Precognition, Survival and Science. (For the uninitiated, psi is extra-sensory perception and psychokinesis.) 
A general introduction to the book is provided by Wheatley and each section has a short introduction by 
Edge. 

An early theme is that the facts investigated by parapsychology are incompatible with the conclusions of 
science, and unintelligible within the framework of our normal thought about the world. So much so that if 
we are forced to accept them we shall need to revise the concepts and principles that determine our present 
framework of thought and science. According to C. D. Broad (p. 12) it is the job of the philosopher to make 
these revisions - a view echoed by H. H. Price (p. 108) when he says it is the philosopher’s ‘business to 
bring about a reconciliation’ between parapsychology and science, if the facts disclosed by the former 
conflict with the latter. A different view is expressed by C. W. K. Mundle: ‘But let’s not alarm the 
psychologists [by constructing psychic and metaphysical theories]. Let’s just challenge them to confirm the 
facts and try to explain them’ (p. 97). 

However, Mundle's paper and many others including several by Broad and Price show that the 
philosopher’s interest has been aroused not just by the existence of ‘strange facts in search of a theory’ but 
because these facts have seemed highly relevant to established philosophical interests such as mind-body 
dualism, life after death, causality, time, perception, memory, etc. These topics constantly recur throughout 
the book. 

A striking feature is that nearly all the papers by non-philosophers are predominantly concerned with the 
relation between parapsychology and science. It seems that parapsychologists feel the need either to 
establish their subject as a science or to establish comfortable relations with science. This, it seems to me, is 
not itself of distinctively philosophical interest, despite the views of Price and Broad referred to above. An 
interesting aspect of the papers on this theme, however, is the diversity of approaches. 

J. Beloff uncompromisingly denies that parapsychology is a science on the grounds that it has failed to 
meet the demand of science that its experiments should be repeatable. This criticism is at least partly met by 
M. Scriven, a philosopher (pp. 64-65). But this is not the issue for all the authors. There is the attempt to 
show that ESP at least might be accommodated by science in its present state (W. G. Roll). And close to this 
is the suggestion that a physics of the future will embrace parapsychology (R. Chauvin). But some of the 
authors plainly believe that science as we know it will not accommodate parapsychology and in different 
ways argue the need to recognize the independent validity of psi alongside science. 

A rather worrying aspect of the book is that virtually all the authors are clearly committed to 
parapsychology, and what is more important, their philosophical interest in parapsychology seems to derive 
from this commitment. A dimension that is not explored is the fact that the philosopher can find 
parapsychology interesting without having this kind of commitment to it. Philosophy feeds off what can be 
imagined as well as off what is known to be the case, and parapsychology can be an important stimulus to 
the philosophical imagination. For example, because of parapsychology we can imagine people generally to 
have psychokinetic powers and we can raise the question of how this would affect our understanding of 
ourselves and others. Inevitably our understanding of such things as intention and human action, and 
therefore our view of what a human being is, would have to undergo radical change. 

To some this exercise of the imagination will appear to be pointless fantasizing. But our philosophical 
understanding of what we are is enlarged and deepened by our knowledge of what we might have been or 
might yet be. 


EDGAR PAGE 


The Psychology of Left and Right. By Michael C. Corballis and Ivan L. Beale. Hillsdale, N.J.: Lawrence 
Erlbaum. 1976. Pp. x+227. £10.85. 


How many messages must have travelled between the hemispheres, northern and southern, in the 
collaboration of Michael Corballis of McGill University, Canada, and Ivan Beale of the University of 
Auckland, New Zealand. The partnership began in 1966 at Auckland with work on mirror image 
discrimination in pigeons. Each then pursued related left-right problems with other colleagues. They met 
again at Auckland in 1974-5 to put the book together. In 1970 Corballis & Beale suggested that a perfectly 
bilaterally symmetrical system would be unable to differentiate mirror images and that some asymmetry must 
be introduced before the organism could ‘tell left from right’. This book is about the problems of ‘telling left 
from right’. Whether or not the authors have got the problems thoroughly sorted out and whether or not their 
hunches prove correct, their analysis should be considered carefully by all working on problems of laterality. 

The preface points out that left-right problems arise in many areas of psychology and in several other 
disciplines. The authors were obliged to read well beyond their own specialisms; the review could not be 
exhaustive and the selection of material was bound to be somewhat idiosyncratic. After introductory 
remarks about the funny things mirrors do to our visual world and what are known as ‘enantiomorphs’, the 
first half of the book reviews experiments on left-right sense as shown in perceptual discriminations, 
response differentiations, the perception of symmetry and mirror image confusions. It considers theories of 
pattern recognition and the possible significance of structural symmetries in the central nervous system for 
the explanation of experimental findings. There are a couple of chapters on the evolution, inheritance and 
ontogeny of the structural and behavioural asymmetries relevant to handedness and cerebral speech 
laterality. Three chapters consider the growth of left-right sense in childhood, the possible role of confusions 
of orientation in reading and writing difficulties and some left-right problems associated with 
neuropsychological disorders. The final chapter looks at a miscellany of left-right asymmetries, especially 
the weak interactions of nuclear decay which led to the fall of parity assumptions in theoretical physics. As 
to the relevance of asymmetries in physics and in biochemistry for the psychology of left and right the 
authors confess they ‘have no idea’. There may well be none at all, but I agree that psychologists should be 
aware of the range of left-right asymmetries in nature. 

Speculations about laterality are legion and the literature grows more and more unwieldy on account of 
sheer bulk. Corballis & Beale cite a great many ideas, often contradictory, and a few in my view hardly 
worth resurrecting. They add some more of their own. How it is all to be evaluated the reader must decide. 

The chief theme of the book, that the symmetry of the nervous system has implications for the 
discrimination and labelling of symmetrical and mirror image stimuli, derives from Ernst Mach, writing at 
the end of the last century. Some of Mach’s ideas about the psychological consequences of physiological 
arrangements have proved valuable but others now seem simplistic. The ingenuity of evolutionary design is 
greater than Mach could have envisaged. Corballis & Beale claim that they have been able to sustain Mach’s 
thesis through a long review of conflicting evidence. The claim seems a matter of faith rather than proof. 
Perhaps the orientation of a message transmitted from one hemisphere to another is of no more relevance to 
its perception and coding than the fact that a letter received in Auckland from McGill would be upside down 
from the point of view of the sender. 

MARION ANNETT 


Toward the validation of dynamic psychotherapy - a replication. By D. H. Malan. Plenum Medical. 1976. 


This intriguing book is the successor of A Study of Brief Psychotherapy (1963), by the same author; it shows 
the same strengths and weaknesses, and indeed incorporates some of the same data and methods of analysis. 
Malan has always been rather a maverick in the psychoanalytic field, accepting unpleasant truths much more 
readily than most of his colleagues; he has not lost this characteristic in the meantime. Thus he accepts as a 
fact of research in psychotherapy ‘the inability to demonstrate from controlled studies, that psychotherapy 
brings about any greater improvement than life experience alone’. He also recognizes as a ‘paralyzing 
weakness’ in psychoanalysis ‘the split between researchers and clinicians’, and suggests that ‘the undoubted 
difficulty of carrying out research on materials of such subtlety and complexity is being used as a defense 
against anxiety, the anxiety of finding that cherished beliefs are incorrect or that therapeutic methods are 
ineffective’. Such honesty is rare, and Malan’s long-continued efforts to make short-term psychoanalysis 
acceptable to his colleagues, and to provide empirical support for his endeavours, deserve the highest praise. 
Can we agree with him that ‘from these two series we thus seem to have come upon something for which 
the world of psychoanalysis has been waiting ever since Freud’s original discoveries, namely, strong 
research evidence for therapeutic factors specific to psychoanalytic therapy’? My own answer would be - 
alas, no. 

The book essentially consists of a detailed examination of the treatment of the patients in the first and 
the second series (i.e. the first and the present, second book); sometimes these are combined, sometimes the 
second series is used to cross-validate the results of the first. Malan is aware of the difficulty of interpreting 
the significance level of the very large number of (not independent) correlations calculated on a rather small 
sample; with so many correlations, quite a large number must appear ‘significant’ by chance, or even appear 
to be ‘replicated’ by chance. Unfortunately this fact is stated only late in the book, in a special short 
chapter which is likely to be omitted by the less numerate readers; it should have been made clear much 
earlier, and in connexion with the discussion of the major ‘conclusions’. Malan’s case depends completely 
on the correlations between outcome and various treatment variables, from motivation to ‘focality’, etc. If 
these are not to be trusted, and are found not to be replicable, then nothing much remains of his claim. 
There is no attempt to have a proper control group, in spite of Malan's clear recognition (based on his own 
work) that spontaneous remission can be a strong factor, and that even single diagnostic interviews (not 
followed by treatment) can have strong effects. I do not believe that without a proper control group (which 
in this case would have to be a placebo treatment group, in order to control for non-specific factors) any 
far-ranging conclusion is possible from a study such as this, although I recognize that for anyone already 
convinced that psychoanalytic treatment works such caution may seem unreasonable. 

I am also somewhat disturbed by the high degree of selection that has entered into the choice of the 
treatment group. They seem to represent a sample less seriously ill, with much greater recovery potential, 
than the typical neurotic patient treated in our department; one would have liked a proper numerical estimate 
of the degree of selection practised. What effect this selectivity may have had it is difficult to say; it certainly 
means that conclusions cannot apply to the general population of neurotics, but merely to a small, 
unrepresentative sample. Against these negative evaluations we must set some positive ones, such as the 
care exercised over the rating procedures, which emerge as remarkably reliable; the presence of reasonably 
refined statistical methods of dealing with the data; and the caution with which Malan often (unfortunately 
not always) approaches conclusions. The book is much too detailed for a review even to mention its major 
conclusions; readers interested in this general area will without doubt wish to read the account for 
themselves. What remains in the mind of one reader at least is the odd juxtaposition of critical, sensible 
caution and enthusiastic, credulous belief in the mind of the author. If he could only have restrained the 
latter, and still further cultivated the former, this would indeed have been a book in a 
thousand. 

H. J. EYSENCK 


Keeping Patients in Psychiatric Treatment. By Chaim M. Rosenberg and Anthony E. Raynes. Massachusetts: 
Ballinger. 1976. £10.40. 


Every psychiatrist is aware that a substantial number of patients fail to keep appointments. Rather 
superficially he blames this upon a variety of factors. Perhaps it is a wet day and no-one wants to come; 
perhaps it is a fine day and it is too pleasant in the garden. Perhaps if the number of defaulters gets too great 
he has a sneaking feeling that he is failing to come up to his patients’ expectations — or alternatively he is 
confirmed in his suspicion that his patients are not entirely worthy of his care. 

It will come as a surprise to the average psychiatrist to see how many papers have been written on the 
subject of the defaulting patient. The authors of the work here reviewed quote from some 250 papers, most 
of them from the United States but an acceptable number from Britain. 

Before such conclusions as they come to can be accepted in this country, it has to be remembered that in 
the US psychiatry seems to be largely equated with psychotherapy with an analytical bias, and of course the 
patients are mainly fee-paying. In a series of thoughtful, clearly written sections, all the major related 
variables are considered. It is a relief to know that the ‘fall-out rate’ is universally high. For every definite 
opinion made by one authority there is a contrary view expressed by another. 

It might perhaps be thought that young, intelligent patients from the upper social groups would have a 
better attendance record than the unemployed or unskilled. This and other generally held beliefs are critically 
examined. One must question the absolute usefulness of the book. A great deal of work has gone into its 
writing, but the reader is bound to ask himself what has been achieved. The fact that one in six of 
out-patients on any given occasion fails to attend and is never seen again is well known. The authors give no 
reason to suppose that this figure can be improved upon. The age of the patient, his social class, his colour, 
his degree of affluence, the liking or otherwise of the psychiatrist for the patient are all subjected to the most 
exhaustive examination, with wide reference to North American and British literature. 

Certainly any psychiatrist, or indeed any therapist, pondering on his success as judged by his attendance 
rate, will find much to comfort him in this book. There is little to suggest that anything will alter the rate of 
defaulting. If this opinion is thought unfairly critical, it is not meant to disparage the work, which brings out 
into the open, in a most readable form, all the major factors; the fact that it comes to no very definite 
conclusion which can improve things is an indication that the problem is insoluble. 

J. A. R. BICKFORD 


Gambling and Society: Interdisciplinary Studies on the Subject of Gambling. Edited by W. R. Eadington. 
Springfield, Ill.: C. C. Thomas. 1976. Pp. xix+466. $34.75. 


This book brings together studies of gambling from five different disciplines: law, economics, sociology, 
psychology and mathematics, in that order. But that does not make it, as the title suggests, 
‘interdisciplinary’. What we have is five ‘parts’ published as a single volume, which may be useful, because 
a serious student of gambling cannot afford to ignore any of its aspects. The designation ‘interdisciplinary’, 
however, demands more than a mere juxtaposition of ‘parts’. It requires that the basic concepts in the 
disciplines concerned should be common to two or more of them, and that they should be employed in the 
same sense in these various disciplines. But this may be too much to ask. 

The editor states in his Preface that the aim of the book is to explore ‘different facets of gambling, how it 
affects the individual, the society in which the gambler exists, and where present trends in attitudes are 
taking us’. Later on, he writes that the entire book is occupied with the question of the legality versus 
illegality of gambling. The various chapters are papers presented at a Conference at Las Vegas, Nevada, in 
1974. It is not surprising therefore that they vary widely in quality. The editor, indeed, warns the reader to 
be ‘somewhat careful in what he accepts as “true”’, if only because the statistics cited ‘are usually about as 
reliable as the sources from which they were drawn’. We can agree, as he suggests, that disciplines can 
benefit when they ‘interface on the same problem’. It is to be doubted, however, whether there is any real 
‘interfacing’ in this book. 

Part I begins, in first gear, somewhat inauspiciously, with a racy account of American gambling practices, 
from which the reader can learn of the great many ways whereby it is possible to cheat. From chapter 3, in 
Part I, we learn about the spread of legalized lotteries in the US. At least 14 States now have official lotteries 
which provide about 3 per cent of the revenue of each State. Chapter 4, on ‘The legalization of gambling’, 
has nothing to do either with ‘legalization’ or with ‘gambling’. It is a journalistic essay on the effects of 
‘forcing’ people to watch football matches on TV, instead of as spectators at the match. In 1970, we are 
told, more than 300 sets were destroyed during games, evidently because of viewers’ frustrations. 

Part II on the ‘Economics of gambling’ is more robust. The argument is more rigorous, a number of 
interesting models of gambling behaviour are described, and much data taken from the Nevada Gaming 
Abstract presented and interpreted. G. Ignatin & R. F. Smith describe parallels between gambling markets 
and speculative financial markets. T. Tsukahara & H. J. Brum offer a model for a casino player which 
exemplifies how gambling may be interpreted as economic behaviour. Chapter 7, by M. E. Canes, which 
follows, tells us about illegal markets for football betting and the legal casino ‘industry’ in Nevada. It 
appears that the amount of play and profitability of games vary a great deal from one area of Nevada to 
another, no doubt reflecting the kind of tourist who is attracted. Finally, an instructive comparison is drawn 
by the editor between the economic performance of publicly owned casinos and public ‘leisure activity’ 
corporations. There may be a moral here for the British scene. 

Part III on ‘The sociology of gambling’ I find much less impressive or convincing. One sanguine 
contributor attempts, by means of a Gallup survey, to test the hypothesis that the tendency to gamble 
increases with increased income. But the very concept of ‘gamble’ here remains in dispute. Nor is the 
chapter on ‘Motivations to gamble’, which one might expect to be psychological rather than sociological, 
any more satisfactory. It is true, however, as has long been known, that games vary with respect to their 
dependence on chance or skill respectively, and that different chance-skill combinations appeal to different 
people, as British experiments (not mentioned in the book) have demonstrated. The last contribution in this 
part deals with the ‘alienation’ of casino card dealers, a topic which appears to be presented here for the 
first time. 

Readers of this Journal will probably be most interested in Part IV, ‘The psychology of gambling’. This 
begins with a brief sketch, ‘How gambling saved me from a mis-spent sabbatical’, by I. Kusyszyn, who 
thinks that, for most players, gambling reflects mental health rather than disease. D. P. Campbell examines 
the relation to gambling of people in different occupations. Here it remains unclear whether they choose an 
occupation because they liked (or did not like) gambling or whether the occupation itself bred a certain 
evaluation of gambling. In the next chapter T. Knapp repeats the familiar notion that the ‘variable ratio 
schedule’ of rewards provided by gaming machines encourages gambling. E. Knowles distinguishes between 
styles of risk and motives for risk. The former, studied in the laboratory, do not very well predict the latter, 
which belong to everyday life. 

The problematic issue as to whether there is a general risk-taking factor is discussed, but not convincingly 
or profoundly, by D. M. Kuhlman, while the controversial question of ‘risky shift’ is considered by G. P. 
Ginsburg et al., who do not appear to be familiar with all that might be said in criticism of this idea. The last 
two contributions to this section have the ‘compulsive gambler’ as their theme. The first, by T. Martinez, 
traces the ‘trajectory’ of a gambler from ‘take-off’ to ‘rock-bottom’; the second, by W. H. Boyd, 
recognizes the important symbolical elements in compulsive gambling. 

Part V, which concludes the book, is mathematical and will probably be above the heads of most 
psychologists. Two chapters deal with blackjack, one with Faro, and one with the more general problem of 
rate of gain in card games when there is sampling without replacement. 

The editor is to be commended for his useful introduction to each part. It is clear that he has carefully 
read each contribution. Of the 33 contributors, no less than 11 are from Nevada itself, 9 from California, 12 
from other Western states, and one from Canada. This scarcely suggests a world view of a topic of 
worldwide interest. The price, $34.75, might exceed the resources of an English purse. 

JOHN COHEN 


Deviance and Control. By Terence Morris. St Albans: Hutchinson Educational. 1976. Pp. 157. £2.45. 


Terence Morris has been a significant figure in British criminology for nearly 20 years and this new book is 
interesting in a number of respects. It can be seen as a benchmark in his biography: the reflective 
considerations of a scholar who on his own admission was ‘nurtured in a positivistic cocoon’ but has now 
come to share the essentially moral concerns of the various sociologists of deviance who have emerged in 
recent years. It is incorrect to brand the book as a ‘controversial reply’ to those radical theorists of deviance 
as the publisher’s come-hither on the back cover says. To do so is to invite disappointment among expectant 
purchasers. 

Such radical theorists, contrary to Morris’s expectations, may be pleased at the extent to which he has 
gone through a process of conversion. The objects of no direct confrontation whatsoever, they are 
sometimes cited approvingly, and the final whimsical comment of the text is that the devil (i.e. materialist 
criminology) seems ‘to have a monopoly of the best tunes’. His rejection of positivist criminology, his 
disillusionment with value freedom, and an awareness of moral pluralism are constant themes throughout the 
book. The views expressed are more likely to be regarded as heretical by those criminologists and 
practitioners whose clinical approach denies the moral status of the deviant, who is reduced to the status of 
patient under the guise of treatment, and by those big-wigs of the legal profession who seem to believe that 
the criminal law has a monopoly on immorality or on socially injurious conduct of a serious nature. 

The best chapters are the first three: densely argued, full of ideas, and often enriched by a keen sense of 
history. The first sees the historical development of the nation state as involving a recasting of the 
conception of heresy. A new consensual morality was necessary, one that would be consistent with the new 
set of property relations. The belief in a single moral community, beautifully illustrated by Scouting for 
Boys, has since been eroded by the belief in civilized democracy and the recognition of moral pluralism. In 
an excellent chapter on legal aspects of the deviation defining process, often ignored in books of this kind, 
Morris shows convincingly how selective prosecution and the erosion of mens rea mean that the crimes of 
the respectable are less stigmatizing, in spite of being more damaging in many cases than a wide range of 
‘traditional’ crimes. With a nice touch of irony he notes how such offenders rarely have their behaviour 
interpreted by the concepts of clinical criminology. 

It is perhaps unfortunate that the book does at times lapse into a ‘textbook’ style, in spite of intentions to 
do otherwise. The chapter on problems of measuring crime and deviance contains simply ‘standard’ material 
and seems out of harmony with the earlier discussion. Similarly, later chapters include a critical survey of 
the unrewarding attempts to find that magic ingredient that makes some people deviant, ranging from early 
physiological theories to the recent research of West. The treatment of social theories is not intended to be 
comprehensive and eventually revisits the earlier themes of white collar crime. The book finally returns to 
the inherent contradiction in the apparatus of control - between punishment, which stresses responsibility 
and blame, and treatment, which denies moral choice. These final three chapters, especially the last, are 
broadly consistent with his major themes. Much of the later discussion is perhaps not direct enough for the 
new student, but most of the book can be strongly recommended as an excellent discussion of the wider 
issues which are often neglected by those psychologists involved in the study and control of deviance. 

CLIVE COLEMAN 


Animal Models in Human Psychobiology. Edited by George Serban and Arthur Kling. Plenum Press. 1976. 
Pp. xiv+297. $22.50. 


Eighteen speakers, ranging from experimental neurophysiologists to ecologists, met in New York to discuss 
the application of studies on non-human animals to the behaviour (‘normal’ or ‘abnormal’) of man. How far 
can observations, experiments and theories based on animal studies be extrapolated to man? In some 
branches of physiology this extrapolation is a necessity because of experimental procedures which are 
physically or psychologically too destructive to be used on man. In what senses can another animal, its 
behaviour, its nervous system and its social life be a ‘model’ for man’s? 

All the authors seem to approve, with varying degrees of caution, the use of animals as models. The 
ethologist Eibl-Eibesfeldt, accustomed to the principles of analogy and homology in animal morphology, and 
having demonstrated that some behaviour patterns in mammals are innate, looks for analogies and 
homologies in human behaviour. This approach seems to have had its successes in comparative studies of the 
parent-infant relation in monkeys and man. Hinde indeed argues for the wider extension of the comparative 
approach. Other contributors use non-human animals to discover the neurological properties of chemical 
substances (Irwin) or to develop telemetric or remote brain stimulators (Delgado). Here the non-human 
animal can be used because evolutionary relationship or biochemical uniformity predicts some similarity 
between the two. Delgado in particular seems very sceptical whether animal models of schizophrenia, for 
example, exist. Both these approaches are based on a belief in a degree of biological uniformity at some level. 

Many of the contributors, however, take a more empirical approach. Some feature or features of 
non-human behaviour is seen to resemble a human condition in symptoms or causation and used in research 
as a substitute for the human condition. D. L. Murphy points out that the symptoms used to characterize a 
human condition are, in fact, only some of its features or at least are not known to be all its features. The 
animal condition also has presumably additional features and is perhaps unlikely to be a ‘simplified replica’. 
The animal is not a model in the sense that it comprises a known selection of components and no others, but 
in the sense that it is cheaper or more convenient in use than a human subject. Many of the articles, indeed, 
describe some success in using animals in this way to investigate partial analogues of depression, 
hyperkinesis and schizophrenia. 

This degree of success will ensure that non-human animals are used in this sort of research, as they are in 
physiological research. Human psychobiology badly needs animal models, and needs to define the limits to 
their use. Whether this will arise from a wide-ranging comparative study of animal behaviour, as Hinde 
hopes, is more doubtful. This book may perhaps promote discussion among psychobiologists of these 
important points. 

JOHN SUDD 
The Genetics of Behavior. By L. Ehrman and P. A. Parsons. Sunderland, Mass.: Sinauer Associates. 1976. 
Pp. viii+390. £11.30. 


This book, intended as a text for a course in behaviour genetics, illustrates the problem of the textbook 
writer in an interdisciplinary science. In this instance, should a background in genetic or in behavioral 
methodology be assumed? The solution adopted by Ehrman & Parsons differs substantially from that of 
another recent textbook in the field, Introduction to Behavioral Genetics by G. E. McClearn & J. C. 
DeFries, thus providing students of this rapidly growing science with a desirable choice between alternatives. 
The earlier book was designed to instruct students in the application of different genetic analytical methods 
to behavioural phenotypes. No previous genetic knowledge was assumed and the book has proved valuable 
for students of psychology. The present book, by two authors who have contributed significantly to research 
on the behavioural biology of the fruit fly genus, Drosophila, appears to have been written mainly for the 
biologist, with a background in basic genetics but not psychology. 

The book is written in two approximately equal parts. The first proceeds by presenting a series of 
examples of the application of genetic principles to behaviour. The level of analysis is not very deep, and 
many of the examples have been drawn from earlier studies which are important for historical reasons, but 
are at best suspect on methodological grounds and have been superseded by advancing techniques. The 
second part which reviews research on those species chiefly studied by behaviour geneticists, is organized 
along taxonomic lines. Such an organization inevitably results in an overall impression of a lack of theoretical 
integration in the field, as topics are scattered rather haphazardly throughout the book according to whether 
they illustrate research on single genes, chromosome effects, quantitative genetics, Drosophila, mice or 
people. The most obvious example is that of non-random mating, a research area of increasing importance, 
partly thanks to the authors' own efforts. Although cited extensively throughout the book, the work is never 
brought together with theoretical developments into a meaningful integration. 

At times the book shows evidence of sloppy writing or editing: for example, on pages 106-107 the formula 
for the inbreeding coefficient F (which does not measure the ‘rate of inbreeding’) is incorrect, and on page 
131 the wrong formula is given for the heritability coefficient, h². Even more serious is the sometimes 
misleadingly idiosyncratic choice of illustrative material in the second part of the book. I was particularly 
appalled by a discussion of ‘the behavioral phenotype’ in mice based on reported correlations between 
numerous traits in three peculiar strains of mice. There is no genetic analysis, and no evidence that the 
authors are aware of repeated criticisms of such studies or of the regularity with which hypothesised genetic 
correlations disappear in segregating populations. 

An earlier behaviour genetics text opened with the comment by Caspari that all biological phenomena can 
be considered from two points of view: mechanism and evolution. Today one of the most promising and 
active areas of research in behaviour genetics is the search for gene-behaviour pathways in biochemistry and 
neurophysiology, a set of topics which is given little space in the present volume. A second area which has 
already created great interest and promises much more is the study of the evolution of social behaviour, not 
mentioned until the last page. It is a pity that Ehrman & Parsons could not have injected a little more of the 
excitement of this new discipline into their text. It is nevertheless a readable book which should appeal to 
the biologically oriented audience for which it was mainly written. 

P. A. TYLER 


Eye Colour, Sex and Children's Behaviour. By A. L. Gary and John Glover. Chicago: Nelson-Hall. 1976. 
Pp. xii+170. $10.00. 


At the outset I should make it clear that I found this book to be a stimulating and enjoyable account of what 
appears to be an ongoing research project, and believe that the material presented by the authors should be 
regarded as a research report, rather than as a collection of well-established and well-documented findings. 
As Gary & Glover indicate, their work was inspired by Worthy's (1974) theories that a neglected factor in 
psychological investigation is that of eye colour, and that there are two general categories or ‘styles’ of 
responses found in both humans and animals, both of which are related to eye pigmentation. The two 
constructs are ‘self-paced’ and ‘reactive’, and are defined thus: self-paced behaviour is that which occurs 
when the stimulus conditions are relatively fixed and when the organism has broad time limits in which to 
respond; reactive behaviour is that which occurs when the stimulus conditions are rapidly changing and when 
the organism must respond quickly for success. The particular aspect of Worthy's original formulation that 
Gary & Glover examine is that these two response styles are highly correlated with eye colour, dark-eyed 
individuals showing a reactive response style, and light-eyed individuals displaying a self-paced style. 
Consequently, a wide range of behaviours should also show a strong correlation with eye colour. 
Superficially, this appears to be extremely unlikely, a point the authors readily accept. However, they 
support their hypothesis with a wide array of experimental data showing eye colour to be a significant factor 
in non-verbal communication, creativity, modelling behaviour, social behaviour and sociability, specific 
learning abilities and disabilities, personality and developmental skills. Although the factor of sex is also 
included in their experimental designs, this is treated relatively briefly, although occasionally there is 
comment on the presence of sex by eye colour interactions. In general terms, the conclusion reached is that 
dark-eyed subjects are more sensitive to immediate external influences than light-eyed subjects, and that 
females are more sensitive to such influences than males. Clearly, the list of topics covered is an extensive 
one, and reflects the authors’ belief that the factor of eye colour and its relation to reactive and self-paced 
behaviours is fundamental to behavioural development, and that inclusion of this as a factor in experimental 
design would lead to a reduction in uncontrolled error terms. 

In addition to a presentation of data, Gary & Glover are not slow to speculate on the implications of their 
findings for the establishment of retraining programmes for those with learning disabilities, as well as 
postulating alternative methods of education in the classroom setting. In this latter context, the authors 
suggest methods which parents might adopt in seeking to change educational procedures, and some sections 
of the final chapter take on the feeling of an educational crusade. It is in this area of speculation that the book 
is least satisfactory. Throughout, the authors take pains to stress that their findings are based on group 
research; that as such they cannot be applied to the individual; that their main interest is in individual 
differences, and in the provision of environments which will suit the requirements of the individual. However, 
in the final chapter there is a tendency to move away from this attitude and towards a categorization based 
on eye colour differences. In this context, Gary & Glover also appear to have fallen into the trap of 
‘dichotomania’ which has ensnared many psychologists. Thus, their particular dichotomy tends to be in 
terms of light-eyed or dark-eyed, with relatively little consideration of other factors, apart from that of male 
or female. Of relevance here is that their treatment inevitably leads into the nature-nurture form of 
dichotomy in that eye colour is an inherited characteristic. They consider too briefly the evolutionary, 
genetic and physiological implications of their arguments, or whether their theories imply strong biological 
influences on behavioural styles. They do not adequately come to grips with this problem, although they 
rightly indicate that genetic influence, in the behavioural sense, merely implies a predisposition, which may 
be subject to developmental change, is not immutable, and which can be greatly modified by environmental 
contingencies. Unfortunately, although this point would be readily accepted by geneticists, it is one which 
has been widely misunderstood by many psychologists, and is not sufficiently emphasized in this book. This 
lack of emphasis could lead to some of the arguments presented being misinterpreted. 

Clearly, many of the points made in the book are potentially controversial, and before accepting many of 
the speculations, one would wish to see a close examination of the findings reported. A search through the 
recent literature revealed relatively few studies in which eye colour was treated as a relevant factor, and of 
those studies doing so, the findings are not clear-cut. Thus, although Markle (1975) provides experimental 
support for these theories, Nisbett & Temoshok (1976) fail to do so. In addition to this, Gary & Glover 
provide evidence on page 78 in a study on eye colour and responsiveness in the classroom which runs 
counter to their predictions. They do, however, provide an ingenious explanation for this, and subsequently 
provide additional experimental data to support their argument. 

Despite these criticisms, this book is of potential value in stimulating research. A great deal of 
experimental evidence is presented, and the authors provide much detail on how the studies were carried 
out, and make a request that other psychologists investigate the phenomena they report. It is here that the 
strength of the book lies. The findings reported are, in many respects, very surprising and as such would 
benefit from further investigation. They should not, however, be regarded as a final statement on established 
fact, rather as indicating avenues of possible research. 


D. F. SEWELL 
A preliminary study of alternative taste languages using qualitative 
description of sodium chloride solutions: Malay versus English 


M. O’Mahony and H. Muhiudeen 


Monolingual English and Malay speakers as well as bilinguals gave qualitative descriptions of salt solutions 
and their descriptive strategies were examined. Malay speakers made greater use of the modifying 
comparative phrase than English speakers, thus making use of a strategy that may help prevent confusions in 
English. Such a strategy may benefit taste panels. 


Qualitative descriptions of aqueous solutions of the so-called primary taste stimulus, sodium 
chloride, have varied from the traditional ‘salty’ description with such terms as ‘sweet’, ‘sour’, 
‘bitter’, ‘soapy’, ‘tasteless’ and ‘indefinite’ (Camerer, 1885; Blakeslee & Salmon, 1935; King, 
1937; Kahn, 1951; Bartoshuk, McBurney & Pfaffmann, 1964; Bartoshuk, 1968). Again, a wide 
range of descriptions is available for the taste of the solvent, distilled water (Brown, 1914; King, 
1937; Börnstein, 1940; Anderson, 1959; Bartoshuk et al. 1964). When subjects had descriptive 
terms, including ‘primaries’, suggested to them but were allowed to use their own words 
(O'Mahony, 1973; O'Mahony & Godman, 1973 b), many novel descriptions were used. 

The use of indeterminate and novel categories as well as attempts by subjects to string 
together descriptive terms provided, such as ‘bitter-sweet’ and ‘bitter-smooth-salty’ (O'Mahony, 1973; 
O'Mahony & Godman, 1973 a), points not only to a lack of skill in the use of language by 
subjects in describing tastes but also to possible deficiencies in the language itself, as far as 
unequivocal taste quality description is concerned. 

Perhaps languages other than English may have better strategies for distinguishing and 
describing tastes. Unfortunately, there has been very little systematic research concerning taste 
languages other than English. Relevant studies (Chamberlain, 1903; Myers, 1904) have been 
concerned primarily with examining the evolution and confusion of taste words in so-called 
primitive cultures. 

Malay speakers, however, are said to make more use of the modifying comparative phrase 
than English speakers, when describing taste. Instead of merely describing a solution as ‘masin’* 
(salty) or ‘manis’ (sweet), they will tend to use descriptions like ‘masin ayer laut’ (salty 
like sea water), ‘masin kitchup’ (salty like soy sauce) or ‘manis gula’ (sweet like sugar) and 
‘manis buah’ (sweet like fruit). This study is a preliminary investigation of any differences that 
may exist between English and Malay speakers in the qualitative description of sodium chloride 
solutions, with a special emphasis on the use of the modifying comparative phrase. 


Method 


Subjects were female students (20-25 years) from a single residential college of the University of Malaya, 
giving a measure of control for diet. There were 11 English-speaking subjects, 10 Malay speakers and 4 
bilinguals. A monolingual was defined as one who had either or both parents speaking one language and had 
been educated in that language to university level. A bilingual was defined as one who had Malay-speaking parents 
but had been educated in English. Each single-language subject performed the experiment once, entirely in 
her own language while bilinguals performed it once in each language, the order of language conditions being 
counterbalanced between subjects. Racially, Malay speakers were Malay, English speakers were Chinese and 
Indian, while bilinguals were Chinese and Malay. A larger sample of subjects could not be obtained owing to 
closure of the university during experimentation. 


* In this paper, all Malay words are written using the old spelling and not the new phonetic spelling. 
Subjects were required to sip every 30 sec approximately 10 ml of an ascending series of sodium chloride 
solutions (distilled water, 4, 8, 12, 16, 20, 30, 40, 60, 80 and 100 mM concentrations). Subjects held stimuli in 
the mouth approximately 3-4 sec, expectorated, and then gave a qualitative description of the stimulus, 
describing it as fully as possible. In the instructions, no taste adjectives were provided for subjects and care 
was taken not to suggest any, so that descriptions would be spontaneous and not supplied by the 
experimenter. Six such series were tasted with a tapwater mouthrinse and a 1 min rest between series to 
minimize the effect of residuals (O'Mahony & Godman, 1974; O'Mahony & Wingate, 1974). During the rest 
interval, tastes described in the previous sequence were read back to the subject by the experimenter. 

Solutions were made up with Analar grade sodium chloride in distilled water and were presented at room 
temperature (26-28 °C). Practice was given prior to experimental sessions with distilled water, 80 mM and 
100 mM sodium chloride solutions; experimental sessions lasted approximately 40 min. 


Results 


The frequencies of use of descriptive terms, computed over the six ascending series for each 
subject, are shown in Table 1. Each descriptive term was scored separately; thus strings of three 
words, ‘triplets’ (O'Mahony & Godman, 1973 a), were scored as three and not one word. 


Table 1. Total number of descriptions for various descriptive categories given by subjects with 
NaCl as a stimulus 


                                            Mean              Number of subjects 
Description                                number      σ      giving descriptions 

English monolinguals, n = 11 
  Total descriptions, all types             104.5     19.7            11 
  Salty only                                 28.8      7.1            11 
  Salty with comparative phrase               0.1       -              1 
  Total salty type                           28.9      7.7            11 

Malay monolinguals, n = 10 
  Total descriptions, all types             107.9     22.6            10 
  Masin only                                 18.7      6.9            10 
  Modified masin                             12.6      6.8            10 
  Total masin type                           31.3      8.9            10 

Bilinguals, n = 4 
  Total English descriptions, all types      95.0     25.1             4 
  Salty only                                 31.5  [illegible]         4 
  Salty with comparative phrase               0.25      -              1 
  Total salty type                           31.75     8.1             4 
  Total Malay descriptions, all types       119.0     56.5             4 
  Masin only                                 19.5      7.0             4 
  Modified masin                             10.5      7.3             3 
  Total masin type                           30.0     10.9             4 


The total number of descriptions used, regardless of whether they were salty descriptions or 
not, was not significantly greater in one language than the other (t test, P > 0.05); even the larger 
difference seen in the bilingual group was not significant although this may have been due to the 
small size of the sample. 
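
The monolingual comparison can be checked roughly from the summary figures in Table 1. The sketch below is only illustrative and rests on assumptions not stated in the text: that the second numeric column of Table 1 is a standard deviation, that the test was an unpaired pooled-variance t test, and that a two-tailed 5 per cent criterion applies. The function name and the hard-coded critical value are introduced here for illustration.

```python
import math


def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Student's t for two independent groups, computed from summary statistics.

    Assumes equal population variances (pooled estimate).
    Returns the t value and its degrees of freedom.
    """
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, df


# Total descriptions, all types: Malay monolinguals versus English monolinguals.
# The second figure in each pair is ASSUMED to be a standard deviation.
t_value, df = pooled_t(107.9, 22.6, 10, 104.5, 19.7, 11)

# Two-tailed 5 per cent critical value of t for 19 degrees of freedom (standard tables).
T_CRITICAL_DF19 = 2.093
significant = abs(t_value) > T_CRITICAL_DF19

print(f"t({df}) = {t_value:.2f}, significant at 0.05: {significant}")
```

Under these assumptions the t value falls well short of the critical value, which agrees with the non-significant result reported for the monolingual groups.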

Considering the number of ‘salty’ type descriptions for each subject, it can be seen that for 
both monolinguals and bilinguals the total number of terms used was the same in Malay and 
English (t test, P > 0.05). However, it can be seen that few English speakers differentiated 
‘salty’ descriptions by a comparative phrase (salty like sea water) while the Malay speakers 
would often differentiate ‘masin’ (salty) from other salty tastes. They used the terms: ‘masin 
ayer laut’ (salty like sea water), ‘masin garam’ (salty like salt), ‘masin kitchup’ (salty like soy 
sauce), and ‘masin maung’ (salty, obnoxious) a mean number of 1.9, 7.1, 2.1 and 1.5 times 
respectively. This greater differentiation is reflected in the fact that while there are an equal 
number of total ‘salty’ type and ‘masin’ type descriptions in Malay and English, there are more 
‘salty’ than ‘masin’ or modified ‘masin’ descriptions (t test, P < 0.05). Within Malay, there is 
still a greater tendency to use ‘masin’ than all the modified ‘masin’ descriptions put together, 
but this trend is only just significant for monolinguals (t test, P < 0.05, one-tailed only) and not 
significant for bilinguals. Within English ‘salty’ is used almost exclusively without modification. 
Thus Malay speakers do not even use the modifying comparative phrase in a majority of their 
descriptions but they do modify their descriptions enough to provide considerably more 
discriminative categories. Although the English language has the ability to accommodate such 
comparative phrases, it appears that English speakers do not use this strategy. 

Similar trends were found with other descriptive categories but not to such a significant extent 
because the majority of descriptions for sodium chloride were naturally of the ‘salty’ type. For 
instance, in English the word ‘sweet’ was never differentiated while in Malay ‘manis’ (sweet) 
was often differentiated to ‘manis buah’ (sweet like fruit) and ‘manis gula’ (sweet like sugar). 


Discussion 


The results obtained may only be treated as tentative indications, owing to the numbers of 
subjects used. It is unfortunate that circumstances beyond the control of the experimenter 
forced a termination of the experiment. 

Examining the words used in English, the so-called primary tastes, ‘salty’, ‘sweet’, ‘sour’ and 
‘bitter’ were used as descriptive terms in the majority of cases (56 per cent) while other 
commonly supplied terms, ‘tasteless’ and ‘indefinite’ made up a further 11 per cent. This is 
interesting because these descriptions were spontaneous, having neither been suggested to 
(O'Mahony, 1973) nor imposed on (Bartoshuk et al. 1964; Bartoshuk, 1968) the subject. The 
remaining third were novel terms, some of which had been noted before: ‘flat’, ‘metallic’, 
‘soapy’ (Brown, 1914; O'Mahony, 1973) and others which had not. These latter were formed 
following previously noted strategies (O'Mahony & Godman, 1973 a) like attempting to describe 
the sensation: ‘stingling’, ‘grasping’, or likening it to a known stimulus: ‘tarry’, ‘rose syrup’. 
There is no previous literature on Malay taste descriptions for comparison, but once again the 
English primary tastes, with or without the modifying comparative phrase, account for a 
majority of descriptions (72 per cent). The remaining terms were formed using the previously 
noted descriptive strategies: attempting to describe the sensation, ‘pedar’ (vibratory), ‘lekat’ 
(thick, clinging), or likening it to a known stimulus ‘sabun’ (soap), ‘assam’ (assam fruit). 

The most important finding is that these Malay speakers tended to make far more use of the 
modifying comparative phrase than English speakers, giving a greater apparent differentiation. It 
is possible however that Malays may be inconsistent in their use of such modifying phrases, 
causing confusions such as the bitter-sour confusion in English (McAuliffe & Meiselman, 1974) 
or the reported salt-sour confusion among Torres Straits islanders (Myers, 1904). The strategy, 
however, provides a useful means of taste discrimination which can be modified and evolve with 
greater experience. Why the Malay speakers should use this strategy is not apparent. Why 
English speakers do not use the strategy is more easily understood in terms of the small part that 
taste description appears to play in English culture compared to say, colour. The strategy, 
however, can easily be adopted in English and reports of its use have been noted (O’Mahony & 
Godman, 1973 a) for water tastes. 

Such use of the comparative phrase would be useful in the definition of terms for organoleptic 
measurements which involve qualitative description like flavour profiling (Cairncross & Sjöström, 
1950; Sjöström, Cairncross & Caul, 1957) and its modifications (Cartwright & Kelley, 1951; Hall, 
1958). Although such procedures involve open discussion to clarify descriptive terms, it is still 
possible for flavour labels to be used differently by panel members; a descriptive adjective with a 
comparative phrase could be less equivocal and avoid confusion. The variables for this technique 
are understudied (Hall, 1958; O'Mahony & Thompson, 1976) but the state of the art is 
summarized in several extensive reviews of the method (Caul, 1957; Caul, Cairncross & 
Sjöström, 1958; Amerine, Pangborn & Roessler, 1965). 

Even though this study is merely a short preliminary exploration, it has revealed strategic 
differences between English and Malay in the description of salt solutions. It remains to be seen 
whether casual reports that the modifying comparative phrase is used to describe tastes other 
than those elicited by sodium chloride are true, although it may be expected from the use of this 
strategy to describe sweet tastes elicited by the stimulus. It also remains to be seen whether 
Malay speakers are affected in their taste discrimination by their linguistic strategies. 
Experimenter bias against subliminal perception? Comments on 
a replication 


Donald P. Spence and Gudmund J. W. Smith 


Criticizes three replications of experiments on subliminal perception for not being true replications. The lack 
of subliminal effects in the replications could very well be explained by a change from sensitive to insensitive 
instruments of measurement. 


Certain aspects of a recent study by Barber & Rushton (1975) bring its status as a replication 
into question. Seeking to test the variable of experimenter bias, Barber & Rushton attempted to 
replicate and extend two subliminal studies (Smith, Spence & Klein, 1959; Spence & Holland, 
1962) and compare conditions where the experimenter had knowledge of the stimulus with those 
where he did not. A significant interaction between knowledge of stimulus and amount of 
subliminal effect would suggest that subliminal effects are mediated, at least in part, by 
experimenter bias. Careful replication is a necessary first step in carrying out such a programme 
because unless there is a clear subliminal effect, it would be difficult to interpret the interaction 
with bias. It is on the question of replication that we wish to take issue with Barber and 
Rushton. 


Experiments I and II 


The original study by Spence & Holland (1962) tested the subliminal effect of the stimulus word 
CHEESE by comparing recall of cheese-associates with recall of control words. The recall list 
was composed of 26 words which consisted of three different categories: buffer words at the 
beginning and end (three each) to absorb the effects of primacy and recency; and ten 
cheese-associates and ten control words, matched for frequency and alternated in position in the central 
part of the list. Recall favouring associates or control words could not, under these conditions, 
be attributed to position effect or to the influence of primacy or recency. 

In their replication, Barber & Rushton abandoned the fixed list of the original study and relied 
instead on a shuffled deck of cards, shuffled anew for each subject. The shuffling procedure only 
approximates randomness, and paradoxically, the more carefully it is carried out, the less 
random the outcome. Gardner (1975) has shown that a deck of 20 cards can be brought back to 
its original order in six perfect shuffles. Barber & Rushton state that the deck of word cards was 
shuffled thoroughly for each subject; we have no definition of ‘thoroughly’, and no data on the 
actual word orders are given in the different experimental conditions. Furthermore, with no 
buffer words, there is no control for the strong effects of primacy and recency; as a result, the 
subliminal effect is being pitted against a much more powerful list effect, to the detriment of the 
former. This problem is independent of the nature of the shuffle; some associates and some 
control words must end up in first and last positions, and their recall cannot in all likelihood be 
attributed to either experimental bias or subliminal stimulation. As a result, the effective list size 
is reduced to 14 words, and if this set is unbalanced for the position and/or frequency of 
associates and control words, a serious random bias is being introduced. 
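
The point about perfect shuffles can be illustrated with a short sketch. It assumes that a ‘perfect shuffle’ means cutting the deck into two equal halves and interleaving them exactly, and it compares the two standard variants (the in-shuffle, where the old top card moves to second place, and the out-shuffle, where it stays on top). Which variant Gardner had in mind is not stated in the text, and the function names are introduced here for illustration only.

```python
def in_shuffle(deck):
    """Perfect riffle in which the first card of the bottom half ends up on top."""
    half = len(deck) // 2
    top, bottom = deck[:half], deck[half:]
    shuffled = []
    for bottom_card, top_card in zip(bottom, top):
        shuffled += [bottom_card, top_card]
    return shuffled


def out_shuffle(deck):
    """Perfect riffle in which the original top card stays on top."""
    half = len(deck) // 2
    top, bottom = deck[:half], deck[half:]
    shuffled = []
    for top_card, bottom_card in zip(top, bottom):
        shuffled += [top_card, bottom_card]
    return shuffled


def shuffles_to_restore(n_cards, shuffle):
    """Count the perfect shuffles needed to return an even-sized deck to its starting order."""
    original = list(range(n_cards))
    deck = shuffle(original)
    count = 1
    while deck != original:
        deck = shuffle(deck)
        count += 1
    return count


print("in-shuffles needed for 20 cards:", shuffles_to_restore(20, in_shuffle))
print("out-shuffles needed for 20 cards:", shuffles_to_restore(20, out_shuffle))
```

With 20 cards the in-shuffle restores the starting order after six repetitions, which matches the figure attributed to Gardner; the out-shuffle takes 18. Either way, an exactly executed shuffle is a fixed permutation, so repeating it cycles through a small set of orders and does not randomize the deck.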

More generally, the experimenters must show that there is no significant interaction between 
word order and experimental condition. Suppose, for example, that cheese-associates occurred 
more often in positions 1-3 and 18-20 during the blank condition. Their recall would be inflated, 
relative to control words, simply by virtue of their position. Since the higher recall occurred 
during the blank condition, it would tend to depress the cue × recall interaction, minimizing the 
subliminal effect. 


The fact that the replication of the original study was flawed interferes with the larger purpose 
of the paper. The authors’ premise may be correct - some of the subliminal effect may, in fact, 
be due to demand characteristics imposed by the experimenter - but to make this point, they 
need to begin with a procedure which is free from artifact. It is particularly important to show 
that the conditions which exist when the experimenter is aware of the stimulus are exactly the 
same as the conditions which exist when he is not; otherwise, the contribution of experimenter 
knowledge is hard to interpret. Such a parallel does not exist in the present study. The 
uncontrolled variation in word order, together with the lack of buffer words, may actually have 
prevented the experimenters from showing an experimenter effect. The replication needs 
replication - but one more carefully conforming to the conditions of the original experiment. 


Experiment III 

The third experiment should also be called a ‘semi-replication’ - this time of a study by Smith et 
al. (1959). In the original study, the words HAPPY and ANGRY were flashed tachistoscopically in a 
mixed sequence of gradually increased subliminal exposures, each alternating with a drawing of 
a relatively expressionless face. When the subjects’ descriptions of the face were classified, they 
were clearly more pleasant, etc., in HAPPY pairings than in ANGRY pairings. However, although 
this finding has been replicated by Somekh & Wilding (1973), such an effect did not reappear in 
the Barber & Rushton experiment. 

One obvious factor behind this lack of subliminal effects seems to be the scoring method. The 
original study allowed the subjects free verbal expression of their impressions of the supraliminal 
face (preceded by either of the subliminal words). The replication used a fixed quantitative scale 
with extremely happy and extremely angry as opposite poles. This methodological change cannot 
be fully appreciated by the reader unless he also knows that the original subjects did not differ 
along a one-dimensional happy-angry continuum. Instead, the word HAPPY tended to facilitate a 
wide spectrum of contented, calm, positive moods, and ANGRY more sad, complex, negative 
ones (only seldom outright angry expressions). Confronted with a fixed list of alternatives, the 
replication subjects may very well have found that no alternative really fitted their impression of 
the supraliminal face and, therefore, tended to choose more or less at random or to prefer the 
non-committal middle range of the scale. 


Conclusion 


It might be argued that the changes in design between the original studies and the replications 
reflect an unconscious bias against subliminal perception. What at first glance look like 
methodological improvements seem, at the same time, to impose restrictions on the 
possibilities for subliminal effects to be expressed in the experimental data. Barber & 
Rushton’s commendable attempts to replicate two key experiments on subliminal perception are 
regrettably marred by their tendency to reduce the sensitivity of the measuring instruments. 


Experimenter bias against subliminal perception? A rejoinder 
Paul J. Barber 





The basis for Spence & Smith’s (1977) critique of three replications of subliminal perception experiments is 
judged to be conjectural. The case that the replications were less sensitive to subliminal perception 
effects than the original experiments is evaluated and rebutted. 





Spence & Smith’s (1977) comments are addressed to some procedural changes between the 
original studies and Barber & Rushton’s (1975) attempted replications that they judge to have 
made the replications less sensitive to subliminal perception effects. 


Experiments I and II 

Two procedural aspects of the replications of Spence & Holland's (1962) experiment are 
discussed by Spence & Smith. Both concern the recall procedure; the first has to do with the 
word order used, and the second with the effects of primacy and recency. 

Both changes in the replications were made because Barber & Rushton misconstrued the 
reasons for including buffer items at the beginning and end of the recall lists in the original 
study. There, since subjects were tested in a group, a single word order had to be used for all 
conditions. In the replications (Expts I and II) an individual testing procedure was adopted (for 
the sake of increased procedural sensitivity) and so different word orders could be used (for the 
sake of generality). To this end a card-shuffling technique was used to generate a new word 
order for each subject. 

At the same time the buffer items were omitted because (a) they would have entailed an 
additional small but potentially troublesome operational detail for the assistant experimenters to 
cope with; and (b) there seemed no reason to suppose that shuffling would not be effective in 
distributing primacy and recency effects evenly across the several experimental conditions. 

Spence & Smith, on the other hand, question the effectiveness of shuffling as a randomization 
device and observe that a perfect shuffle can restore the original order of a deck of cards in a 
surprisingly small number of trials. Since ‘perfect’ shuffles can be managed usually only by 
professional magicians or machines, it is hard to see that this criticism has much force. 
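
The remark about perfect shuffles can be checked with a short sketch. It assumes an ordinary 
52-card deck and a perfect 'out' riffle (the deck is cut exactly in half and the halves are 
interleaved with the top card staying on top). The deck size and the function names are 
illustrative assumptions and are not details of either experiment. 

```python
def out_shuffle(deck):
    """Perfect riffle: cut exactly in half and interleave, top card stays on top."""
    half = len(deck) // 2
    top, bottom = deck[:half], deck[half:]
    shuffled = []
    for a, b in zip(top, bottom):
        shuffled.extend([a, b])
    return shuffled


def shuffles_to_restore(n_cards):
    """Count perfect out-shuffles needed to return an even-sized deck to its start order."""
    if n_cards % 2:
        raise ValueError("this sketch assumes an even number of cards")
    original = list(range(n_cards))
    deck = out_shuffle(original)
    count = 1
    while deck != original:
        deck = out_shuffle(deck)
        count += 1
    return count


print(shuffles_to_restore(52))  # 8
```

The small count depends on every cut and interleave being exact, which an ordinary hand 
shuffle is very unlikely to achieve. 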

But the fact is that the word orders were not recorded (again so as not to encumber the 
assistant experimenter with what were then judged to be trivial details) and so it is possible that 
Spence & Smith were correct in speculating that there may have been crucial differences 
between the word-orders for the various conditions that worked against the subliminal 
perception hypothesis. A solution would be for the same set of word orders (and surely more 
than one is needed for the sake of generality) to be used in all conditions. Nevertheless it would 
be a mistake to set too much store by this particular criticism since it does seem in the 
circumstances to be unlikely. The example given by Spence & Smith of how the subliminal 
effect could be minimized - by cheese-associates occurring more often in the primacy and 
recency regions of the list when the blank stimulus was presented - would have to happen not 
once but twice in separate experiments since the subliminal effect failed to appear in both Expts 
I and II. The imbalance (between subliminal and blank conditions) would have in the first case to 
hold up over two groups of 24 subjects, and in the second over two groups of 20 subjects. Even 
allowing for other ways of producing the same effect - for example, for the primacy and recency 
regions to have too few cheese-associates in the subliminal condition - the case against Expts I 
and II seems to require such a convoluted set of unfavourable circumstances that it may be 
judged improbable at best. 
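
The remedy suggested above, using one recorded set of word orders in every condition, might 
be sketched as follows. The word list, the condition labels, the seed and the function names 
are all hypothetical; 24 orders are generated only because that matches the group size 
mentioned for Expt I. 

```python
import random


def make_word_orders(words, n_orders, seed=0):
    """Return n_orders independently shuffled copies of the word list."""
    rng = random.Random(seed)  # fixed seed so the orders can be recorded and reused
    orders = []
    for _ in range(n_orders):
        order = list(words)
        rng.shuffle(order)
        orders.append(order)
    return orders


def assign_orders(orders, conditions):
    """Give every condition the identical set of word orders (one per subject)."""
    return {condition: [list(order) for order in orders] for condition in conditions}


# Hypothetical recall list and condition labels, for illustration only.
words = ["word%02d" % i for i in range(1, 11)]
orders = make_word_orders(words, n_orders=24)
design = assign_orders(orders, ["subliminal", "blank"])
```

Because both conditions receive the same orders, any serial position advantage for 
particular words is matched across conditions rather than left to chance. 
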

In the replications the primacy-recency buffer items were excluded and so serial position 
effects operated normally over the whole list. As a result Spence & Smith consider that ‘the 
subliminal effect is being pitted against a much more powerful list effect’ (italics added). But the 
list effect occurs in all conditions, for all subjects, and the subliminal effect is superimposed on 
it and hence ought not to suffer statistically. Moreover, for lists as long as those used in these 
studies, free recall is not typically so high in the primacy-recency regions as to prevent 
differences emerging between experimental conditions. 


Experiment III 

Spence & Smith argue persuasively that the change from the use of free verbal expression in the 
original study (Smith, Spence & Klein, 1959) to a rating scale in the replication may have 
obviated the subliminal perception effect. However, three points need to be made about this 
change and their criticism. 

First, Barber & Rushton accepted that this was a major difference but they made out a case 
for it, citing three reasons in its favour. Second, this kind of change did not prevent Somekh & 
Wilding (1973) from getting positive evidence from a variant of the HAPPY-ANGRY paradigm. 
Finally, the criticism is again conjectural and it remains true that ‘unfortunately there is no 
convincing independent empirical justification for the use of one method as opposed to the 
other’ (Barber & Rushton, 1975, p. 367). 


Conclusions 


Inevitably the speculative nature of Spence & Smith’s criticisms makes them hard to evaluate 
and it must be conceded that their case may be valid. However, consideration of the case in 
detail suggests that it has essential weaknesses. Moreover it should be noted that several steps 
were taken to enhance the sensitivity of the design of the replication experiments, and Spence 
& Smith’s comments, even if valid, should not therefore be evaluated in isolation from these 
other design improvements. 


Self-rated imagery and vividness of task pictures in relation to visual memory 


Göran H. Berger and Samuel C. B. Gaunitz 





Marks (1973) and Gur & Hilgard (1975) have reported success in predicting performance in visual-memory 
tasks from scores in a questionnaire of self-rated vividness of imagery, i.e. the Vividness of Visual Imagery 
Questionnaire (VVIQ; Marks, 1973). The results obtained were attributed to the use of the VVIQ and 
pictures with a high degree of vividness in the memory task. 

These findings were disconfirmed in two experiments in which the VVIQ was used and vivid pictures were 
presented in the memory tasks. Subjects were to judge whether pairs of similar, successively presented 
pictures were identical or not. The results indicated that subjects rated as ‘good’ imagers did not perform 
differently from those rated as ‘poor’ imagers. The influences of demand characteristics in the previous 
experiments and differences in experimental procedures were referred to as possible causes of the observed 
inconsistencies in results. It is also suggested that questionnaires of self-rated visual imagery are ineffective 
as predictors of performance, since they only cover a limited aspect of imagery. 


It is reasonable to assume that recall and recognition of pictures are mediated by visual-imagery 
processes. Attempts to relate self-reported vividness of imagery with performance in memory 
tasks have, however, generated conflicting and inconclusive results (for a further discussion, see 
Paivio, 1971). Recently, however, Sheehan has succeeded in predicting performance in memory 
tasks by using a revised form of Betts's Questionnaire upon Mental Imagery (Betts, 1909; 
Sheehan, 1967 a, b). These main findings were not replicated when control procedures similar to 
the ‘double-blind’ design were introduced to reduce the influences of demand characteristics 
(Sheehan & Neisser, 1969). Thus, for example, none of the subjects had previously taken part in 
studies of imagery, and none knew the purpose of the experiment. Further, the subjects were 
assigned to experimental groups in such a way that the experimenters did not know who were 
high and who were low scorers. 

In later reviewing previous research, including the Sheehan & Neisser study, Neisser (1970) 
concluded that self-reported vividness has a poor predictive value as regards accuracy of 
performance. 

The conclusion reached by Neisser was challenged in a study by Marks (1973) and in a study 
by Gur & Hilgard (1975). Marks's assumption was that the difficulties in predicting performance 
in visual-memory tasks could be attributed to two major obstacles. Firstly, the questionnaires 
conventionally used contained items referring to several sensory modalities. Secondly, the 
pictures presented in the memory tasks were lacking in vividness, which may have deactivated 
important variables related to visual memory, like meaningfulness, affect and interest. 

Marks constructed a new questionnaire, the Vividness of Visual Imagery Questionnaire 
(VVIQ), which consists of only visual items (Marks, 1972). This questionnaire contains 16 items, 
some of which were borrowed from Betts and some of which were constructed by Marks 
himself (Marks, 1972). The items refer to common stimulus situations and the subject's task is to 
rate the vividness of the visual imagery that the items presented may evoke. 

In the study by Marks, a series of three experiments was performed. The subjects were 
divided into groups of ‘good’ and ‘poor’ imagers, according to their VVIQ scores. A selection 
of coloured pictures representing objects and scenes was presented to the subjects in the 
memory tasks. In the experimental trials, the subjects were asked to recall details concerning 
successively presented pictures. ‘Good’ imagers were more successful in recalling the pictures 
presented in all three experiments. Furthermore, females were more accurate in recall than 
males. 

Marks does not report whether the ‘double-blind’ procedure (see Sheehan & Neisser, 1969) 
was adopted in his experiments. Thus, it does not appear whether the subjects knew the purpose 
of the experiments or whether the experimenter knew which subjects were low and which were 
high scorers. In Expt. III, the possible influence of demands may have been reduced, since the 
questions asked in each trial were tape-recorded. However, the instruction was not. 

While Marks emphasized the specificity of the VVIQ consisting of purely visual items, Gur & 
Hilgard (1975) pointed out the global character of the VVIQ as a measure of visual imagery. 
According to them, not much should be made of the emphasis on vividness, since it may rather 
reflect a dimension of ‘imagery controllability’ (for a further discussion on this concept, see 
Richardson, 1972), as well as other dimensions of imagery. However, Gur & Hilgard obtained 
results which strongly supported the findings of Marks. Like Marks, Gur & Hilgard divided their 
subjects into ‘good’ and ‘poor’ imagers, according to their VVIQ scores. The subjects were to 
detect the difference between pairs of similar pictures selected from the Meier Art Tests: I. Art 
Judgement (Meier, 1940). The dependent variable was the time needed to detect the difference, 
and this time was measured by one experimenter, using a stop-watch. Under one condition, the 
pictures were presented simultaneously and, under the other, successively at intervals of 20 sec. 
The performance of the ‘poor’ imagers deteriorated when the pictures were presented succes- 
sively, whereas the performance of the ‘good’ imagers remained unaffected by the mode of 
presentation, which further indicated that visual memory was the critical variable separating the 
subjects. 

It is not reported, in the study cited, whether the subjects knew the purpose of the 
experiment, whether the subjects had participated as subjects in previous imagery experiments 
or whether the experimenters were aware of the subjects’ VVIQ scores. Furthermore, expecta- 
tions on the part of the experimenter concerning the subjects’ performances may have influenced 
his recordings of the detection time. The dependent variable measured may also have stimulated 
risk-taking among the subjects. It seems reasonable to assume, for example, that a subject who 
considered himself a good imager would rely more on his assumed imagery capability, when 
somewhat uncertain as to the real difference between pairs of pictures, and therefore react faster 
than a ‘poor’ imager, also somewhat uncertain. 

The present experiments were conducted to find out whether the results obtained by Marks 
and Gur & Hilgard would be confirmed, when a design similar to the ‘double-blind’ design was 
introduced to reduce the influences of demand characteristics. The stimulus materials used in the 
Gur & Hilgard study were presented in the memory tasks. The subjects responded according 
to a forced-choice technique, as applied by Marks, thereby avoiding detection time as the 
dependent variable. The subjects had no prior experience of imagery experiments and they 
did not know the purpose of the experiments. Their VVIQ scores were unknown to the 
experimenter present at the performance tasks. 


Experiment I 
Method 


Subjects. Forty-eight undergraduate students of psychology served as subjects. There were 28 females and 
20 males. 


Stimulus materials. Thirty-six pairs of pictures selected from the Meier Art Tests: I. Art Judgement provided 
the discrimination task. Eighteen pairs were identical and 18 pairs were slightly different. Each of the 
different pairs consisted of one picture which was a reproduction of the original work of art and one picture 
which was a slightly different reproduction. Varying shade, perspective or details that were missing or 
changed constituted typical differences between pairs. All pictures were black and white. The stimuli were 
reproduced as diapositive copies and projectors were used to project the stimuli on a screen. 

Procedure 
The subjects first performed the memory task and then completed the VVIQ. The subjects were generally 
tested in groups of three. 

The 36 diapositives were divided into two groups, in which nine pairs represented identical pictures and nine 
pairs represented slightly different pictures. The two groups of 18 pairs occurred under two experimental 
conditions. The time interval between the removal of the first projected picture in a pair and the screening of 
the second was varied between the two groups of 18 pairs. Both experimental conditions occurred in one 
session. The order of presentation between the conditions was counterbalanced. 

Under both conditions, the subjects inspected the first picture in a pair for 7 sec and the second picture for 
5 sec. Under one condition, the time interval between the removal of the first picture and the screening of 
the second picture was 0.2 sec and, under the other condition, it was 5 sec. The inspection times and 
between-pictures intervals were automatically regulated. After a pair of pictures had been presented, the 
subjects had to judge whether the pictures were alike or not. The subjects made their judgements on a 
questionnaire, using the response categories ‘Different’ and ‘Alike’. The subjects were instructed to give 
answers in all the trials and to make as many correct judgements as possible. Each correct response, whether 
to a different or to an identical pair of pictures, was scored as one point. The subjects were not to go into any 
detail and received no feedback from the experimenter. The experimenter did not communicate with the 
subjects during the experimental task. 


Results 


The subjects were divided according to the median of their VVIQ scores into two groups 
consisting of 24 ‘good’ imagers and 24 ‘poor’ imagers (X̄ = 56.5 and X̄ = 41.8, respectively). The dependent 
variable was the number of correct judgements, ranging from zero to 18 for each group of 
pictures. Random guessing would give a subject nine correct judgements on average. 
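
The chance level quoted above follows from treating each judgement as an independent guess 
between two equally likely responses. A minimal sketch under that assumption (the function 
names are illustrative, and the independence of trials is an idealization) is: 

```python
from math import comb


def expected_correct(n_trials, p_correct=0.5):
    """Mean number of correct judgements under pure guessing."""
    return n_trials * p_correct


def prob_at_least(k, n_trials, p_correct=0.5):
    """Binomial probability of k or more correct judgements by guessing alone."""
    return sum(
        comb(n_trials, i) * p_correct**i * (1 - p_correct) ** (n_trials - i)
        for i in range(k, n_trials + 1)
    )


print(expected_correct(18))  # 9.0
print(prob_at_least(13, 18))
```

The second function indicates how often a single guessing subject would reach a given 
score on 18 pairs, which helps in judging how far observed scores lie above chance. 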

The results were subjected to a 2 × 2 × 2 analysis of variance, with repeated measurements of 
the ‘time-interval’ factor. The analysis revealed no significant effect of either the VVIQ 
(F = 2.56, d.f. = 1, 44, P > 0.05) or sex (F = 0.18, d.f. = 1, 44, P > 0.05) or the time interval 
(F = 3.77, d.f. = 1, 44, P > 0.05). 

If the subjects expected the differences to be more or less distinct than they actually were, one 
would expect a preference for one of the response alternatives. To investigate whether such a 
misinterpretation was marked among the subjects, the individual scores were subdivided into 
numbers of ‘alike’ and ‘different’ responses and a t test was performed on the difference 
(X̄ = 17.2, X̄ = 17.9, d.f. = 47, t = 1.85, P > 0.01). The results indicated no response bias. 
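
A comparison of this kind can be sketched as a paired-samples t test on each subject's counts 
of the two responses. The reported 47 degrees of freedom for 48 subjects are consistent with a 
paired comparison, but the source does not name the exact variant, so this is an assumption. 
The data and the function name below are hypothetical. 

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """Return (t, d.f.) for a paired-samples t test on two equal-length score lists."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length lists with at least two subjects")
    diffs = [a - b for a, b in zip(first, second)]
    standard_error = stdev(diffs) / sqrt(len(diffs))
    return mean(diffs) / standard_error, len(diffs) - 1


# Hypothetical counts of 'alike' and 'different' responses for four subjects.
alike = [17, 18, 16, 19]
different = [19, 18, 20, 17]
t, df = paired_t(alike, different)
```

The sketch does not guard against the case where every subject shows the same difference, 
in which the standard error would be zero. 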

The mean numbers of correct responses for the different groups for the two time intervals are 
given in Table 1. 


Table 1. Mean number of correct responses in the memory task for ‘good’ and ‘poor’ imagers 
of each sex for the time intervals indicated in Expt. I 


                            Good imagers         Poor imagers 
Time intervals (sec)      Males    Females     Males    Females     Mean 
0.2                       12.9     12.4        12.9     11.9        12.5 
5                         13.8     14.1        12.2     13.1        13.3 
Mean                      13.4     13.3        12.6     12.5        12.9 


The results of Expt. I were not in accordance with the results of the studies by Marks and by 
Gur & Hilgard. The divergence from their results may have been due to the shorter between- 
pictures intervals used in the present experiment and to the smaller difference in VVIQ scores 
between the non-preselected groups of subjects. A second experiment was performed, in which 
these conditions were optimized. The effect of the order of presentation as between the VVIQ 
and the memory test was also investigated, since self-rated vividness after performance in a 
memory task may be affected by the subjects’ knowledge of the results (Marks, 1972). 


Experiment II 

Method 

Subjects. Forty-eight undergraduate students of psychology and other university students served as subjects. 
There were 34 females and 14 males. 


Procedure 

The stimulus materials and the apparatus were identical with those used in Expt. I. The subjects 
constituted two groups of 24 individuals. The subjects in one group were preselected from a sample of 99 
individuals, according to their extremely high or low VVIQ scores. These subjects then performed the 
memory task. The other group, consisting of 24 subjects, completed the VVIQ after they had performed the 
memory test. 
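
The preselection of extreme scorers can be illustrated with a short sketch. The screening 
scores are invented, the function name is hypothetical, and higher VVIQ totals are treated as 
‘good’ imagery only because that is the direction of the group means reported in this paper. 

```python
def select_extreme_groups(scores, group_size):
    """Pick the group_size highest and lowest scorers from a {subject_id: score} mapping."""
    if 2 * group_size > len(scores):
        raise ValueError("not enough subjects for two non-overlapping groups")
    ranked = sorted(scores, key=scores.get)  # subject ids, lowest score first
    return ranked[-group_size:], ranked[:group_size]


# Hypothetical VVIQ totals for a screening sample of 99 people.
screening = {"S%02d" % i: 30 + (i * 37) % 50 for i in range(1, 100)}
good, poor = select_extreme_groups(screening, group_size=12)
```

Selecting from the two tails in this way widens the gap in mean VVIQ score between the groups 
compared with a median split of an unselected sample. 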

Each subject was presented with the 36 pairs of stimuli, in which 18 pairs were identical and 18 pairs were 
slightly different. The time interval between the removal of the first picture in a pair and the screening of the 
second was 20 sec. The subjects’ responses were given as in Expt. I. The experimenter did not communicate 
with the subjects during the experimental task. The subjects’ VVIQ scores were unknown to the experi- 
menter. 


Results 


The results were subjected to a 2 × 2 factorial analysis of variance, with the two factors 
‘imagery’, i.e. good or poor imagery, and ‘presentation order’, i.e. the VVIQ presented before 
or after the memory task. The dependent variable was the number of correct judgements ranging 
from zero to 36, where random guessing would provide 18 correct marks, on an average. The 12 
‘good’ preselected imagers had an average VVIQ score of 63.3 and the 12 ‘poor’ preselected 
subjects averaged 40.0. The 24 preselected and non-preselected subjects classified as good 
imagers (X̄ = 60.5) did not perform differently from the 24 preselected and non-preselected 
‘poor’ imagers (X̄ = 43.5, F = 0.14, d.f. = 1, 44, P > 0.05). Neither the presentation order 
(F = 0.85, d.f. = 1, 44, P > 0.05) nor the interaction between the two variables (F = 1.84, 
d.f. = 1, 44, P > 0.05) had any effect. The subjects did not display a preference for the response 
alternative ‘alike’ or ‘different’ either (X̄ = 17.6, X̄ = 18.4, d.f. = 47, t = 0.70, P > 0.01). 

The mean number of correct responses in the different groups for the two presentation orders 
of the VVIQ are given in Table 2. 


Table 2. Mean number of correct responses in the memory task for ‘good’ and ‘poor’ imagers 
when the VVIQ was presented before and after performance in Expt. II 


VVIQ presentation            Good imagers     Poor imagers     Mean 
Before the memory task       26.0             25.2             25.6 
After the memory task        25.6             26.9             26.3 
Mean                         25.8             26.1             26.0 


The results obtained in Expt. II indicate that the manipulation of the interval between 
inspection and test picture did not produce a difference in performance between ‘good’ and 
‘poor’ imagers. Further, the order of presentation between the VVIQ and the memory test did 
not affect self-rated vividness or accuracy of performance in the memory task. 


Discussion 


Self-rated vividness of imagery in the VVIQ did not successfully predict the subjects’ 
performance in the memory tasks in the present two experiments. Performance was not 
affected by the different time intervals between inspection and test picture (i.e. the 0.2-, 5- and 
20-sec intervals). Female subjects did not perform differently from male subjects either, which contradicted 
the observations of Marks (1973). The present experimental results are not in accordance with 
the results of the studies by Marks and by Gur & Hilgard (1975). 

The differences in VVIQ scores between ‘good’ and ‘poor’ imagers were of the same 
magnitude in all three studies. The pictures in all the investigations were selected as being 
lifelike and vivid. Why, then, are the present experimental results not in accordance with those 
obtained in the previous studies? If the non-significant differences obtained in the present 
experiments were due to procedural fallacies, then a comparison with the procedure used by Gur 
& Hilgard might clarify the issue. 

The stimulus materials presented in the two studies were similar: the pictures were selected 
from the Meier Art Tests: I. Art Judgement (Meier, 1940). However, different dependent 
variables were used. In the present experiments, the administration of the stimulus materials and 
the recording of the response were more standardized, thereby providing better control of the 
experimental situation, for example, reducing the communication between experimenter and 
subjects. Furthermore, a subject’s confidence in his capacity to form images, which may be 
supposed to be reflected in his VVIQ ratings as well as in his detection-time performance, may 
also have affected the results in the Gur & Hilgard study. By using a forced-choice design in the 
present experiments, risk-taking attitudes among subjects were eliminated as a possible cause of 
different performance between ‘good’ and ‘poor’ imagers. 

The refinements of the procedure of the present experiments may, however, have some 
shortcomings. The subjects may have misinterpreted the instruction concerning the kind of 
differences between pictures that would appear, since these varied substantially between pairs. If 
so, the subjects would, for example, have been responding ‘alike’ more frequently, if they 
expected the differences to be larger. However, the results indicate that there were no response 
preferences of the kind mentioned in any of the experimental groups. 

Another possible obstacle may have been the somewhat technical and impersonal character of 
the procedure used. Imagery, as a subject of study in a psychological laboratory, may demand a 
certain atmosphere, which encourages imaginative activities among the subjects. Although this 
suggestion relates to most studies of imagery, the treatment of subjects in the Gur & Hilgard 
experiment may have inspired the subjects’ imaginative involvement more than in the present 
study. 

In conclusion, the present experiments used the same stimulus materials as the Gur & Hilgard 
study. The experimental procedure was altered to gain better control of extraneous variables. 
The procedure used may have discouraged the subjects from utilizing their imaginative 
capability. 

A main difference between the present study, on the one hand, and those of Marks and Gur 
& Hilgard, on the other hand, concerns the selection of subjects. The ‘double-blind’ technique 
proposed by Sheehan & Neisser (1969) was not adopted in the last-mentioned experiments. 
Furthermore, the subjects participating in the present experiments had no experience of 
experiments of visual imagery. They were unaware of the purpose of the experiments. The 
order of presentation between the VVIQ and the memory task did not influence performance 
in Expt. II. This indicates that the subjects were also incapable of guessing the purpose, when 
performing the experimental tasks. 

A possible explanation of the failure to predict the subjects’ performances in the memory 
tasks from self-rated vividness of imagery may be the weaknesses of the VVIQ questionnaire. In 
factor-analytical studies of variables connected mainly with visual imagery, several factors have 
been isolated. Paivio et al. (see Paivio, 1971) extracted four factors, one of which was defined as 
a spatial-ability factor and a second was defined by the subjects' subjective reports or ratings. 
The VVIQ would be related only to the last-mentioned factor. 

In another context, studies have been performed to investigate the capacity of single words to 
evoke visual images. It has been shown by Paivio (1968) that self-rated vividness of imagery is 
closely correlated with ‘imagery latency’ and ‘rated ease of image arousal’. ‘Until such 
variables are teased apart we cannot be sure of the conceptual or empirical status of the 
vividness concept in relation to memory research’ (Paivio, 1972, p. 262). 

For these and possibly other reasons, it appears difficult to define the relations between 
self-rated vividness and performance in memory tasks, in which visual imagery may mediate 
successful performance. There seems to be a need for a new questionnaire with greater 
sophistication in relation to the underlying mechanisms of self-rated imagery. 


Acknowledgement 


The authors are grateful for the suggestions and advice which they received from Professor E. R. Hilgard. 


References 
Betts, G. H. (1909). The distribution and functions of mental imagery. Columb. Univ. Contrib. 
Educ. 26, 1-99. 

Gur, R. C. & Hilgard, E. R. (1975). Visual imagery and the discrimination of differences between 
altered pictures simultaneously and successively presented. Br. J. Psychol. 66, 341-345. 

Marks, D. F. (1972). Individual differences in the vividness of visual imagery and their effect on 
function. In P. W. Sheehan (ed.), The Function and Nature of Imagery. New York: Academic Press. 

Marks, D. F. (1973). Visual imagery differences in the recall of pictures. Br. J. Psychol. 64, 17-24. 

Meier, N. C. (1940). The Meier Art Tests: I. Art Judgement. Iowa City, Iowa: Bureau of Educational 
Research and Service. 

Neisser, U. (1970). Visual imagery as process and as experience. In J. S. Antrobus (ed.), Cognition 
and Affect. Boston: Little, Brown. 

Paivio, A. (1968). A factor-analytical study of word attributes and verbal learning. J. verb. Learn. 
verb. Behav. 7, 41-49. 

Paivio, A. (1971). Imagery and Verbal Processes. New York: Holt, Rinehart & Winston. 

Paivio, A. (1972). A theoretical analysis of the role of imagery in learning and memory. In 
P. W. Sheehan (ed.), The Function and Nature of Imagery. New York: Academic Press. 

Richardson, A. (1972). Voluntary control of the memory image. In P. W. Sheehan (ed.), The Function 
and Nature of Imagery. New York: Academic Press. 

Sheehan, P. W. (1967a). A shortened form of Betts’ Questionnaire upon Mental Imagery. J. clin. 
Psychol. 23, 386-389. 

Sheehan, P. W. (1967b). Visual imagery and the organizational properties of perceived stimuli. Br. 
J. Psychol. 58, 247-252. 

Sheehan, P. W. & Neisser, U. (1969). Some variables affecting the vividness of imagery in recall. 
Br. J. Psychol. 60, 71-80. 


Received 4 June 1975; revised version received 23 January 1976 


Requests for reprints should be addressed to Göran H. Berger, Department of Psychology, University of 
Uppsala, Uppsala, Sweden. 
Samuel C. B. Gaunitz is at the same address. 


Br. J. Psychol. (1977), 68, 289-295 Printed in Great Britain 289 


Selective encoding processes in recognition 


Jon Wright, Donald S. Ciccone and John W. Brelsford 


Recent studies using orienting tasks (Hyde & Jenkins, 1969, 1973) suggest that a subject’s encoding 
operations are best characterized as a directed or selective process. Inferences about encoding from these 
experiments, however, are based on quantitative differences in recall performance. It was felt that more 
sensitive tests of the nature of information stored as the result of hypothetical encoding operations were 
necessary. In a recognition paradigm, critical items were presented in the context of meaningful modifiers or 
in the context of nonsense, rhyming modifiers to encourage subjects to selectively encode specified sets of 
stimulus attributes. The effectiveness of the manipulation was evaluated in terms of the false alarm errors to 
either semantically related distractors, or non-semantically related distractors. Despite indications of an 
effective encoding manipulation, no recognition performance differences to unrelated control items were 
obtained. Some implications for the levels of processing theory of Craik & Lockhart (1972) were discussed. 


An information-processing approach to memory requires certain assumptions about the encoding, 
storage, and retrieval of stimulus information. Tulving & Thomson (1973) have summarized one 
set of assumptions by the principle of encoding specificity. According to this principle, the 
nature of stored information is a direct result of the encoding operations performed on the 
nominal stimulus. In turn, the retrieval of information at test depends on the extent to which 
retrieval cues match the information stored during study. Given this link between the encoding 
of an item and its retrieval, differences in encoding operations should be reflected in the retrieval 
performances of subjects under conditions where the retrieval environment is held constant. 

Recently, there have been some attempts to manipulate the nature of encoding processes 
through the use of orienting tasks (Hyde & Jenkins, 1969, 1973). These studies have indicated 
that tasks requiring semantic processing produce greater amounts of recall than tasks requiring 
the processing of non-semantic information. These orienting tasks were designed to control a 
subject’s encoding strategy by requiring him to attend to a specified set of stimulus attributes (cf. 
Ciccone & Brelsford, 1975). For example, a subject might be instructed to rate the pleasantness 
of a given word or determine the number of characters a word contains. Implicit in these 
manipulations is the notion that subjects can selectively encode. When confronted with a task 
requiring non-semantic processing, subjects presumably encode non-semantic information, while 
conversely, with a semantic task they encode semantic information. 

Inferences regarding the selectivity of encoding processes require confirmation that encoding 
manipulations are accomplishing their goals. Hyde & Jenkins (1973) have drawn a theoretical 
distinction between semantic and non-semantic encoding on the basis of quantitative differences 
in recall performance. However, their results can be explained more simply in terms of 
differential semantic encoding. Specifically, a non-semantic task might not induce non-semantic 
encoding at all but instead serve to lower the probability that an item will be semantically 
encoded. The resulting recall differences might then be explained in terms of a single semantic 
dimension. Differences in recall performance, even though substantial and reliable, cannot serve 
as a basis for theoretical inferences regarding selectivity in the encoding of stimulus attributes. 
The recall differences observed in orienting task experiments can be explained either in terms of 
selective non-semantic and semantic encoding or in terms of the encoding of semantic attributes 
along a single dimension. Choosing between these alternatives requires a confirmation of the 
assumed theoretical processes. This can be done by assessing the nature of the resulting trace 
information since it is assumed that stored information reflects the type of encoding operations 
which preceded storage. 
Tulving & Bower (1974) discuss a variety of ways that have been used to assess the nature of
such trace information. One effective way is to observe the type of confusion errors subjects 
make in a recognition task. Subjects presumably make false alarms (yes responses to new items) 
on the basis of the number of matched features between the test stimulus and the stored 
information. If a memory trace arising from a certain encoding process contains mostly semantic 
information subjects should exhibit confusability (i.e. make false alarms) to semantically related 
distractor items. On the other hand, if a trace contains mostly non-semantic information, 
confusability to non-semantically related items should be higher. Once the encoding manipulation 
has been confirmed in this way, it becomes possible to investigate the effects of semantic and 
non-semantic encoding strategies on general recognition performance. This might be 
accomplished by inserting neutral control distractors in a test of recognition. Performance on 
these control distractors would be relevant to ascertaining the effect of a particular encoding 
strategy on general recognition performance. For example, Craik & Tulving (1975) have 
presented data suggesting that semantic encoding is superior to non-semantic when given a test 
of recognition. They do not, however, specify the nature of the distractor material which they 
employed. It seems reasonable to expect that the effect of a given encoding strategy might be 
investigated to some extent by the nature of the relationship between old study items and new 
test items (i.e. by the type of distractors employed). 

Generally, attempts to assess trace information through confusion errors have not coincided 
with the use of orienting tasks. The experiment of Elias & Perfetti (1973) is an exception. They 
induced encoding strategies by having subjects generate either synonyms, associates or rhymes 
to critical items for a ten-second period (i.e. there were semantic, associative, and phonemic 
conditions, respectively). Using a recognition task and a distractor technique similar to that 
described above, they obtained distractor confusions for both phonemic and associative orienting 
tasks but not for the semantic orienting task. However, because of the particular orienting task 
they chose, it was possible for subjects to generate some of the actual distractor items used at 
test during the study phase of the experiment. Elias & Perfetti eliminated these items from the 
analysis. Such a procedure is likely to reduce sensitivity in detecting confusion errors because 
the most closely related items are also most likely to be generated during study and thus 
eliminated from the data. Nevertheless the experiment provides some support for the selective 
encoding notions thought to underlie orienting tasks. Additionally, Elias & Perfetti (1973) found 
that recognition for control items (words having no specified relationships to distractors) was 
highest for the semantic task, followed by the associative task and then the phonemic task. 

Ciccone & Brelsford (1975) provided a check on encoding manipulations by including high 
associate distractors in their recognition test. They obtained more confusion errors to the high 
associate distractors in the semantic encoding condition than in the structural (non-semantic) 
encoding condition. However, firm support for selective encoding as a product of orienting tasks 
requires that both non-semantic and semantic confusability be detected. Ciccone & Brelsford did 
not attempt such a test of their orienting tasks. However, they found that recognition 
performance in the case of control items was higher in their ‘non-semantic’ than in their 
semantic condition whereas Elias & Perfetti (1973) reported just the opposite. 

The present experiment represents an attempt to confirm the selective encoding processes 
thought to underlie orienting tasks. In addition, the experiment was designed to shed some light 
on the empirical discrepancy as to whether semantic encoding inhibits or facilitates recognition 
performance in general, and the identification of distractor items in particular. An encoding 
manipulation similar to that of Ciccone & Brelsford (1975) was chosen. Critical items were 
presented in the context of either semantic or non-semantic modifiers to induce the encoding of 
specified stimulus attributes. Meaningful modifiers were used to direct encoding processes 
toward the encoding of semantic attributes, and nonsense, rhyming modifiers were used to direct 
the processing toward non-semantic attributes. 
Method
Design 


Encoding strategy was manipulated in a between-groups design. Critical items were presented visually and 
preceded either by a meaningful modifier (semantic condition) or a nonsense, rhyming modifier (non-semantic 
condition). All subjects received recognition tests which included three types of distractors: semantic 
(synonyms of old items), non-semantic (homonyms of old items), and control (unrelated to old items). 


Materials 


The critical items were 60 nouns chosen from the Thorndike & Lorge (1944) norms (A and AA
frequency). Sixty adjectives were also chosen from that source for use as semantic modifiers. In addition, 60
nonsense modifiers were constructed so that each would rhyme with one and only one of the critical items. 
Buffer items presented during study and distractor items for the recognition test were also taken from the 
Thorndike & Lorge norms, with frequency kept within a level similar to that for critical items and their 
modifiers. Included among the distractor items were eight synonyms and eight homonyms to critical items. 
Thus, the recognition test consisted of 44 control study items and 44 distractors, eight synonym study items 
and eight distractors, and eight homonym study items and eight distractors. An attempt was made to keep 
other relationships between critical items, buffers, and distractors at a minimum. 


Procedure 


Two lists were constructed which differed only in that different types of modifiers preceded the critical 
items. Examples of critical items and their modifiers are presented in Table 1. Each experimental list 


Table 1. Examples of critical items and their modifiers


Semantic modifier    Non-semantic modifier    Critical item
Splendid             Salk                     Talk
Tired                Borse                    Horse
Still                Mond                     Pond
Little               Sard                     Yard
Severe               Sall                     Fall


consisted of alternating study and test blocks of five items each. Study blocks contained either two or three 
critical items plus their modifiers with the remaining space filled by buffer items and their modifiers. Test 
blocks contained either two or three critical items plus distractors. Items were randomized within blocks with 
the exception that half the synonym and homonym distractors followed their critical words in the test blocks
and half preceded them. An old item was one which had been presented during a study block and was 
subsequently tested. A new item was presented during test but had not appeared before anywhere on the list. 
The probability of an item being old or new over all test blocks was 0.5. There were five study blocks and
five test blocks intervening between the presentation of an item and its subsequent test of recognition (the 
number of items intervening between study and test was between 75 and 84 items for each critical item). 
Three non-critical study blocks and three non-critical test blocks preceded presentation of the critical 
material. 
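
The list structure described above can be checked with a little bookkeeping. The following is a minimal sketch, assuming that the critical test blocks contained only the 120 items enumerated under Materials (60 old items and 60 distractors); the function and variable names are illustrative and do not come from the original report.

```python
# Item counts as stated in the Materials section; block size as stated in the Procedure.
OLD_ITEMS = {"control": 44, "synonym": 8, "homonym": 8}
DISTRACTORS = {"control": 44, "synonym": 8, "homonym": 8}
BLOCK_SIZE = 5


def summarize_recognition_test(old_items, distractors, block_size):
    """Return basic bookkeeping for the critical part of the recognition test.

    Assumption (not stated in the source): the critical test blocks hold
    exactly the old items and distractors listed here and nothing else.
    """
    n_old = sum(old_items.values())
    n_new = sum(distractors.values())
    n_total = n_old + n_new
    n_blocks = n_total // block_size
    return {
        "n_old": n_old,
        "n_new": n_new,
        "p_old": n_old / n_total,
        "n_test_blocks": n_blocks,
        "mean_old_per_block": n_old / n_blocks,
    }


summary = summarize_recognition_test(OLD_ITEMS, DISTRACTORS, BLOCK_SIZE)
print(summary)
```

Under that assumption the probability of an old item is 0.5, as reported, and the average of 2.5 old items per test block agrees with the statement that test blocks contained either two or three critical items.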

Stimuli were presented visually on a closed-circuit television display, and their presentation rates were 
controlled by a PDP-8 computer. In front of the television display was a keyboard on which the subjects 
recorded all their responses. All subjects were run individually and received identical instructions. During 
study blocks, subjects were instructed to repeat out loud the items as they appeared on the screen. During 
test trials, subjects were instructed to press yes or no response keys depending on whether or not they felt 
they had seen the item previously. Each subject received a short practice list to ensure that the task was 
clearly understood. For study blocks, modifiers and nouns were presented for 1 sec separately and in
succession at 0.5 sec intervals. Each modifier and noun sequence was preceded by the 1 sec presentation of a
star. Interpresentation interval between star and items was also 0.5 sec. During test blocks, items were
presented on the screen for 2 sec. Each study block began with the appearance of the words ‘study trial’ 
and each test block with the words ‘test trial’. 
Subjects

The subjects were 20 students in an introductory psychology class at Rice University who participated for 
course credit. They were assigned in a block randomized order to experimental conditions. 

Results 

Each subject’s hit (yes response to old items) and false alarm rates were converted to d’ 
measures through the use of tables provided by Elliot (1964). Mean hit and false alarm rates and 
mean d' values are provided in Table 2. For synonyms and homonyms, both the study item and 


Table 2. Mean hit and false alarm rates and mean d' values as a function of encoding condition
and distractor type


Distractor type      Hits    False alarms    d'

Semantic encoding condition
  Controls           0.72    0.10            2.11
  Synonyms           0.65    0.30            0.79
  Homonyms           0.78    0.10            2.34
Non-semantic encoding condition
  Controls           0.76    0.14            1.95
  Synonyms           0.74    0.26            1.47
  Homonyms           0.73    0.24            1.57


its related distractor were included in the recognition tests. This allowed the calculation of 
separate d' values for each item type. The key issue in this experiment depended on whether or 
not a subject could discriminate between study items and closely related distractor items. Since 
confusion between such items can be reflected in hit rates as well as false alarm rates, it was felt 
that d' would be a more sensitive measure of confusability. The rationale for using separate hit 
rates to calculate d' assumes that subjects may change criteria for outputting a yes response as 
they proceed through a recognition test. This does not appear to be an unreasonable assumption. 
In fact, hit rates were roughly the same over different item types and encoding conditions, with 
the possible exception of the synonyms in the semantic encoding condition. 
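
For readers without access to published d' tables, the conversion can be sketched numerically. The example below assumes the standard equal-variance Gaussian signal detection model, in which d' is the difference between the z-transformed hit rate and the z-transformed false alarm rate; the function name is illustrative and is not taken from the original report.

```python
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate):
    """Return d' = z(hit rate) - z(false alarm rate).

    Assumes the equal-variance Gaussian model. Rates of exactly 0 or 1
    have no finite z-score, and NormalDist.inv_cdf raises an error for them.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)


# Illustration using the mean rates for control items in the semantic condition.
example = d_prime(0.72, 0.10)
print(round(example, 2))
```

Note that a value computed from group mean rates need not equal the mean of the individual subjects' d' values, which is what Table 2 reports; the two coincide only when the transformation is applied to identical rates.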

A demonstration of the effectiveness of the encoding manipulation depends largely on whether 
or not semantic and non-semantic confusability can be demonstrated in the corresponding 
conditions. If a semantic encoding manipulation succeeds in producing memory traces which 
contain mostly semantic information, items which are similar in meaning (synonyms) should be 
more readily confusable. For non-semantic encoding manipulations, items which are 
non-semantically related (homonyms) should be more confusable than other types of items. In 
other words there should be an interaction between the encoding condition and distractor type. 

In terms of the d' data, there was a significant effect of distractor type, F = 10.30, d.f. = 2, 36,
P < 0.001, MSe = 0.46, but no main effect for encoding condition, F = 0.09, d.f. = 1, 18, P > 0.05,
MSe = 0.92. More importantly, there was a significant interaction between encoding condition
and distractor type, F = 5.46, d.f. = 2, 36, P < 0.01, MSe = 0.46. The finding of a significant
interaction seems consistent with the notion of selective encoding. 
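
The form of the interaction can be seen by subtracting the mean d' values of the two encoding conditions for each distractor type. The sketch below uses the mean d' values from Table 2; it is a descriptive illustration only, not a reanalysis, and the names used are illustrative.

```python
# Mean d' values as reported in Table 2.
MEAN_D_PRIME = {
    "semantic": {"control": 2.11, "synonym": 0.79, "homonym": 2.34},
    "non_semantic": {"control": 1.95, "synonym": 1.47, "homonym": 1.57},
}


def condition_differences(table):
    """Return semantic minus non-semantic mean d' for each distractor type."""
    semantic = table["semantic"]
    non_semantic = table["non_semantic"]
    return {kind: round(semantic[kind] - non_semantic[kind], 2) for kind in semantic}


differences = condition_differences(MEAN_D_PRIME)
for kind, diff in differences.items():
    print(f"{kind}: {diff:+.2f}")
```

A negative difference means poorer discrimination in the semantic condition and a positive difference means poorer discrimination in the non-semantic condition. The opposite signs for synonyms and homonyms, together with a small difference for controls, are the crossover pattern that an interaction between encoding condition and distractor type describes.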

Individual comparisons indicated that the specific differences causing the interaction were 
consistent with our expectations. For homonyms, there was poorer discrimination in the 
non-semantic condition than in the semantic condition, F = 5.93, d.f. = 1, 18, P < 0.05. On the
other hand, synonyms were more difficult to discriminate in the semantic condition, F = 4.60,
d.f. = 1, 18, P < 0.05. However, for control items, no significant difference was obtained,
F = 0.33, d.f. = 1, 18. The finding of no difference between encoding conditions for control items
held true when the analysis was confined to false alarms, F = 0.26, d.f. = 1, 18, MSe = 0.04, and
hits, F = 0.34, d.f. = 1, 18, MSe = 0.02.

In addition, since the number of false alarms to control distractors could not reasonably be 
below the number of false alarms to semantic or non-semantic distractors (items which are more 
similar cannot be less difficult to discriminate), some directional tests were performed to 
determine the extent of the actual confusion errors. A similar criterion for the use of directional 
hypotheses has been suggested by Kimmel (1957). These tests consisted of comparisons within 
encoding conditions to determine if the number of false alarms to either synonym or homonym 
distractors was greater than that for the control distractors in the same encoding condition. We 
would expect subjects to make more false alarms to those items which are similar to the kind of 
information which was encoded during study. By this reasoning more false alarms to synonym 
distractors indicate the encoding of semantic information during study and more false alarms to 
homonym distractors indicate non-semantic encoding during study. 

In the semantic encoding condition, the comparison between control distractors and synonym 
distractors was significant, F = 15.58, d.f. = 1, 36, P < 0.001, MSe = 0.013, while the false alarm
rates for control distractors and homonym distractors were virtually identical. In the
non-semantic encoding condition, the comparison between the control distractors and synonym
distractors was significant, F = 4.82, d.f. = 1, 36, P < 0.05, as was the comparison between
control distractors and homonym distractors, F = 3.22, d.f. = 1, 36, P < 0.05. Thus, there was
appreciable semantic information encoded in the semantic condition as indicated by the
confusion errors to synonym distractors. In the non-semantic condition, both semantic and
non-semantic information was encoded as indicated by the confusion errors to synonym and
homonym distractors.


Discussion and conclusions 


The results give some support to the notion that subjects can attend to and selectively encode 
stimulus attributes associated with a word. In terms of the d' measure there were clear 
differences between the semantic and non-semantic encoding conditions. Discrimination was 
more difficult for synonyms in the semantic encoding condition and more difficult for homonyms 
in the non-semantic condition. These data offer strong support for our ability to manipulate the 
encoding process. It is thus clear that a subject's encoding processes can be influenced by 
relatively simple manipulations. 

Despite the evidence for an effective encoding manipulation in terms of the type of 
information encoded, there was no difference in recognition performance to control items. This 
was contrary to other reported studies (Elias & Perfetti, 1973; Ciccone & Brelsford, 1975). One 
point made clear by the present study is that recognition performance depends to a large extent 
on the relationship of distractor items to critical items. Both non-semantic and semantic 
similarity between critical items and distractors can reduce recognition performance depending 
on the nature of prior encoding operations. This means that distractor items should be carefully 
controlled if one is to have an unconfounded measure of retention. Some discrepancies in the 
literature, in fact, may be understood on the basis of this variable. 

Ciccone & Brelsford (1975) reported that structural (i.e. non-semantic) encoding was superior 
to semantic encoding in recognition. An examination of the stimuli employed in that experiment 
suggests that distractor selection may have been the cause of that finding. The choice of 
distractors having semantic or associative relationships to critical items makes the detection 
problem in recognition more difficult. The present study indicates that semantic similarity 
between critical and distractor items will interact with the encoding process lowering 
performance more in the semantic encoding condition than in the non-semantic condition. A 
comparison of the false alarm rates in the semantic encoding conditions for the two experiments 
revealed that the rates were over twice as high in the Ciccone & Brelsford study as in the 
present study (20.2 versus 9.7 per cent). The false alarm rates in the non-semantic conditions
were essentially the same (17.6 versus 14.3 per cent).

Some aspects of the present experiment have implications for the levels of processing theory
of Craik & Lockhart (1972). According to that theory, the encoding operations performed on a 
stimulus take place in a hierarchical manner. Orthographic and phonemic features are encoded 
first, followed by the analysis of semantic information. At the deepest levels of processing, 
attempts are made to relate the incoming stimulus information to other information stored in 
memory. In terms of this processing hierarchy, semantic encoding follows non-semantic 
encoding. When the deeper levels of processing have been reached, information from earlier 
processing tends to be lost from the store. Thus, semantic information is assumed to be more 
persistent than other types of information. Retention is thought to be a function of the depth to 
which an item has been processed. 

The present experiment is relevant to three assumptions of the levels of processing approach. 
(1) Retention varies directly with level of processing. The presence of confusion errors to 
homonyms in the non-semantic encoding condition suggests that the encoding operations under 
that condition did not proceed as far in the processing hierarchy as for the semantic encoding 
condition. Following the most direct interpretation of levels of processing theory, there should 
have been clear effects of encoding operations on recognition performance to control items. Yet 
no such effects were obtained. However, other studies have indicated that recognition 
performance varies with level of processing (Schulman, 1971; Elias & Perfetti, 1973; Craik & 
Tulving, 1975). These studies have employed orienting tasks using incidental learning paradigms, 
while the present study used an intentional paradigm. The main issue, however, is not 
differences between incidental and intentional learning paradigms but whether or not level of 
processing can be defined in terms of the kind of information that is encoded. If validity checks 
on encoding manipulations can be made, then the predictions of levels of processing theory 
ought to hold independent of learning paradigms. 

Craik & Tulving (1975) present data which indicate that during study trials non-semantic 
orienting tasks generate consistently shorter latencies than semantic tasks. Thus non-semantically 
encoded items probably receive less processing under incidental tasks. This may not be true in 
intentional paradigms, when subjects presumably attempt to store as much information as 
possible about each study item, even though the encoding process may be biased by the 
orienting task. Such data do not exclude levels of processing theory since the superficial analysis 
of stimuli should take less time. However, they suggest an alternative explanation - that the 
lowered recognition performance following non-semantic orienting tasks is due to a smaller 
amount of encoded information in such tasks, and not to the qualitative nature of the encoded 
information (non-semantic versus semantic). 

(2) Trace persistence is a function of level of processing. The interval between study and test in 
the present study was nearly five minutes. The fact that non-semantic confusions were sustained 
through this period indicates that non-semantic information was present and providing a basis for 
response decisions after a long-term delay. Elias & Perfetti (1973) obtained phonemic confusions 
after nearly eight minutes of intervening activity. This suggests that under the appropriate 
conditions non-semantic traces can be as persistent as semantic traces. However, neither 
experiment was designed to determine the extent of the relative persistence. Longer intervals 
may show that semantic information is more durable as levels of processing theory predicts. 

(3) Encoding processes occur in a hierarchical manner, processing first non-semantic 
information and then semantic information. In the present study, there was evidence for 
confusion errors to both synonym and homonym distractors in the non-semantic encoding 
condition. Evidently subjects were making use of both non-semantic and semantic information in 
that condition. This was not true in the semantic encoding condition where subjects tended only 
to make confusion errors to synonyms. This finding is not consistent with a hierarchical
conception of encoding processes. In the hierarchy, non-semantic encoding is assumed to occur
before semantic encoding. Thus the detected presence of semantic information presupposes that 
the related non-semantic information has been encoded also. Non-semantic confusions should 
therefore be detected in any conditions where semantic confusions are detected. The present 
study does not support this idea. 

However, alternative interpretations can be made which preserve the notion of a hierarchical 
encoding process. First, it might be the case that non-semantic information is lost more rapidly 
than semantic information. Thus, the presence of semantic information might not always indicate 
that non-semantic information was available in the store. Craik & Lockhart (1972) make such a 
suggestion. However, this interpretation requires the additional assumption that non-semantic 
information is lost more rapidly after a semantic encoding manipulation than after a
non-semantic manipulation, since this finding was obtained in the present study. Secondly, confusion
errors may be associated more with the detection process than with the presence or absence of
a particular kind of information. Semantic encoding manipulations may produce highly specific 
semantic information which clearly distinguishes distractors from critical items. However, such 
highly specific semantic information should tend to reduce the number of confusion errors to all 
types of distractor items. Such a finding did not occur. 

Perhaps the most parsimonious interpretation of the present data is that encoding processes 
are not hierarchical, but rather may be directed by the subject or the experimenter depending on 
the experimental circumstances. It seems reasonable that encoding processes might be altered by
the subject to suit the demand characteristics of specific orienting tasks. If such were the case,
subjects could direct their encoding efforts toward the unique features of stimuli required for
efficient performance in a given task.
A developmental study of the effects of instructions on visual shape
judgements 


Jeffrey Field and John K. Collins 





Previous developmental studies of visual shape judgements have failed to instruct subjects clearly as to the 
type of shape judgements required. Using the method of adjustment, 180 subjects in five age groups, 6, 8, 10,
12 and 19 years, were instructed to judge either the real or projected shape of an elliptical standard slanted 
39, 57 and 72 degrees from the fronto-parallel. No age changes were found in the tendency towards shape 
constancy. Objective shape judgements exhibited a significantly greater degree of constancy than projective 
judgements and there was no decline in constancy with increasing slant of the standard under objective 
instructions, in contrast to a marked decline under projective instructions. Female subjects consistently 
manifested a greater tendency towards shape constancy than males. 





Shape constancy refers to the relative stability of the judgements of the shape of an object
in spite of wide variations in its retinal projection. Previous developmental studies of visual
shape judgements have not recognized the possible effects of instructions on shape constancy
(Thouless, 1932; Klimpfinger, 1933; Meneghini & Leibowitz, 1967). The aim of the present
study was to examine age changes in visual shape judgements made under objective instructions,
which required the subject to judge the real shape of an object, and projective instructions,
which required a judgement in terms of the retinal image of the object. The differential
effects of these two types of instructions on shape judgements have been well established only
in the case of adult subjects (Landauer, 1964a; Lichte & Borresen, 1967).

While the findings of Bower (1966) on shape constancy in infants strongly suggest that 
constancy is either unlearned or rapidly learned, this does not preclude the possibility of 
subsequent age changes in constancy. On the basis of the results of a study by Klimpfinger 
(1933), Brunswik (1956) claimed that there was an increase in the degree of shape constancy up 
to at least ten years, while Piaget (1969) postulated a similar trend in regard to size constancy. 
Piaget (1969) also suggested that projective judgements may be more accurate in children eight 
years or younger than in adults. However, Koffka (1935) and Gibson (1969) maintained that 
reported age changes in shape constancy may be more the result of methodological differences 
between studies than of any basic change in size and shape perception, although Gibson (1969) 
also specifically predicted an increase in the accuracy of projective space judgements with age. 

Phenomenal instructions requiring a judgement of the immediate experience of object shape 
(Thouless, 1931; Osgood, 1953; Vernon, 1970) were excluded from the present study for several 
reasons. First, since phenomenal shape cannot be operationally defined, the range of possible 
judgements that subjects can make is not clearly delimited and, as Sutcliffe (1972) has noted, 
clear specification of the performance required from a subject is a prerequisite for effective 
instructions. Second, there has commonly been a failure to find a significant difference between 
mean phenomenal and projective shape judgements of adults (Thouless, 1931; Epstein, Bontrager 
& Park, 1962; Lichte & Borresen, 1967) and there has only been one convincing demonstration 
of the capability of adult subjects to make phenomenal shape judgements uniquely in relation to 
objective and projective judgements (Landauer, 1964b). The distinction between phenomenal
and objective or projective shape does not seem possible in young children. Rapoport (1967) 
found that children below ten years did not give different size judgements to phenomenal and 
objective instructions and Vurpillot (1964) reported a similar lack of distinction in the shape 
judgements of children below 12 years. Third, the likelihood that age changes in understanding 
and interpretation may accompany the use of phenomenal instructions is suggested partly by the
fact that under phenomenal instructions less intelligent subjects tend to make objective shape 
judgements, while highly intelligent subjects tend towards projective judgements (Leibowitz, 
Waskow, Loeffler & Glaser, 1959; Leibowitz & Sacca, 1971). 

A problem associated with the use of the same verbal instructions for young children and 
adults alike is that any age trend observed may be contaminated with age changes in the 
understanding of instructions. Attempts must be made, therefore, to overcome this difficulty by 
eliminating as far as possible any vagueness or ambiguity in the objective and projective
instructions employed. By means of practical demonstrations, elaborations and repetitions, a 
high degree of ‘recallability’ (Gagne, 1965) of the instructions should be established in subjects 
of all ages before examination. 

One previous developmental study by Vurpillot (1964) did make use of varied instructions in 
shape judgements but serious drawbacks in methodology greatly limit the generality of her 
results. The order of presentation of instructions was always the same with phenomenal 
instructions preceding objective instructions; under one instruction condition comparison stimuli 
were presented simultaneously, under another successively; and no allowance was made in the 
comparison series for overconstancy. 

Interpretations of experimental findings concerning shape constancy often reveal a lack of
appreciation of the complex interdependence of subject variables, instructions, observation 
conditions and aspects of judgement methodology affecting the degree of constancy observed. 
For example, very little consideration has been given to the nature of the interaction between the 
slant of the standard object and the type of instructions used. 

An examination of shape constancy studies, in which the slant of the standard was varied, 
revealed that when either phenomenal or projective judgements were required the degree of 
constancy reliably decreased with increasing object slant from the fronto-parallel plane
(Leibowitz et al. 1959; Landauer, 1964a; Vurpillot, 1964; Leibowitz & Meneghini, 1966;
Meneghini & Leibowitz, 1967). With objective instructions Landauer (1964a) found no significant
change in constancy over increasing object slant. There are far fewer data available on the
relation of objective and projective judgements to changes in slant than for the case of 
phenomenal judgements and even less is known about the interaction between instructions and 
object slant in children. 

In the following experiment an attempt was made to examine age trends in shape constancy
under instructions designed to ensure maximum effectiveness in delimiting the range of 
responses, to provide more data on the relationship between the type of instructions and the 
angle of slant of the standard object, and to test for possible sex differences in shape constancy 
judgements. 


Method 
Subjects 


There were 180 subjects in five equal groups of 6, 8, 10, 12 and 19 years of age. The younger groups were 
drawn from public schools, the last group were student volunteers. There were equal numbers of males and 
females in each group and all had at least corrected 6/6 vision. Details are shown in Table 1. 


Apparatus 


The standard shape was an ellipse with a vertical axis of 12 cm and a horizontal axis of 9 cm which was cut 
from 1.5 mm steel and painted with yellow fluorescent paint. It was attached to a round black pointer in the 
middle of a viewing box 91 cm high, 74 cm wide, and 45 cm deep. The centre of the standard was 117 cm 
above the floor and was viewed through a 50 cm square black cardboard aperture against a 2.5 cm square 
black and white checkerboard background. The rest of the interior of the box was matte black. Between 
judgements the standard was hidden by a black curtain which the experimenter could operate by drawstrings 
from the back of the box. Illumination was provided by a 100 Watt incandescent lamp 45 cm above the 
standard, which in turn could be rotated in a counterclockwise direction about its vertical axis to a 
predetermined angle by turning a knob at the back of the viewing box. The viewing box distance of both the 
standard and variable shapes was 170 cm. 


Table 1. Age means, standard deviations and ranges in years of the 18 males and 18 females in 
each of the five age groups 

          Males                          Females 
Age 
group     Mean    S.D.    Range          Mean    S.D.    Range 
 6         5.9    0.31     5.6-6.5        5.9    0.28     5.5-6.4 
 8         8.0    0.30     7.5-8.5        8.0    0.32     7.5-8.5 
10        10.0    0.24     9.7-10.4      10.0    0.23     9.5-10.5 
12        12.0    0.21    11.7-12.4      12.1    0.25    11.7-12.4 
19        19.1    0.87    18.1-20.8      18.8    0.53    18.1-20.3 


The variable shape was a sharply defined patch of light projected onto a 30 cm square opal acrylic screen 
through a rotating elliptical aperture with a vertical axis of 7.6 cm and a horizontal axis of 15.2 cm. This 
method of projection resulted in a variable shape with a constant vertical axis of 8 cm and a horizontal axis 
that could be continuously varied from zero (no patch of light) to 16 cm. Standard and variable shapes of a 
different size were used in order to compel subjects to judge in terms of shape rather than size. 

The subjects made their judgements by turning a wheel to alter the horizontal diameter of the variable 
shape. The wheel, which was positioned within easy reach in front of the subject, was connected by a 
160 cm long steel rod through a 4 to 1 ratio contrate gear to the elliptical aperture behind the variable shape 
display screen. The angle in degrees of the subject's adjustment of the variable aperture could be read by the 
experimenter from a pointer over a fixed protractor at the rear of the display box to the nearest 0.25°. At a 
variable aperture setting of 60° the shape projected was a circle of diameter 8 cm. By turning a knob 
associated with the pointer at the rear of the apparatus, the experimenter could set the variable aperture at 0° 
or 90° for descending and ascending trials respectively. The 0° and 90° settings corresponded to 16 cm and 
zero widths of the variable ellipse. The eye level of subjects was held constant at the height of the standard 
and variable stimuli by means of an adjustable chair. The distance between the centres of the standard and 
variable shapes subtended an angle of 18°. 

There were a number of pretraining stimuli. The first two were designed to avoid any size and shape 
confusions in subjects' judgements. The first series comprised 12 geometrical forms of the same physical 
area including two rectangles and two rhombi that were the same shape and size. The forms were presented 
simultaneously in a random array on a 51 × 63 cm white cardboard sheet. The second series comprised 12 
forms that were all different sizes but had two triangles and two trapezia of the same shape. Two other 
pretraining stimuli consisted of a black ellipse with axes of 15 and 10 cm on a white cardboard sheet which 
was used with its major axis vertical then horizontal to eliminate possible orientation confusion and another 
black ellipse with a vertical axis of 15.2 cm and a horizontal axis of 7.6 cm which was used to check the 
subjects’ understanding of the judgement procedure. 

Three other forms were used during the instructions phase of the procedure to demonstrate the 
characteristics of objective and projective shape. They were a 15 × 10 cm ellipse, with major axis horizontal, 
and a 10 cm diameter circle both of which were black cardboard cut-outs attached to thin rods so that they 
could be held upright by the experimenter and a demonstration ellipse 15 × 10 cm made in the form of the 
standard shape. This ellipse was placed in the standard box and slanted 48° from the subject's fronto-parallel 
plane to form a circular retinal projection. 


Procedure 


Testing was carried out under natural conditions. There were no physical restrictions on head movement and 
viewing was binocular. During testing a 20 Watt fluorescent lamp above the standard shape apparatus 
provided general illumination of the laboratory. 

The subjects were tested individually, with total testing time varying from 20 min for the youngest to an 
average of 10 min for the oldest age group. The 18 males and 18 females in each group were randomly 
allocated to either the objective or projective instruction conditions, with the restriction of equal numbers 
per instruction group. Each subject made nine test judgements, three at each of the angles of slant of the 
standard from the fronto-parallel. These slants were 39°, 57° and 72°, representing approximately 0.23 
interval steps for the cosine of the angle of rotation of the standard. The order of presentation of the three 
slants and the direction of the variable shape adjustment (ascending or descending) were counterbalanced 
both within and between subjects. 
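
The choice of slants can be checked with a short calculation. The sketch below assumes simple cosine foreshortening of the 9 cm horizontal axis of the standard; the function and constant names are illustrative and do not come from the original report.

```python
import math

STANDARD_WIDTH_CM = 9.0      # horizontal axis of the standard ellipse (Apparatus section)
SLANTS_DEG = (39, 57, 72)    # slants of the standard from the fronto-parallel plane


def projective_width(slant_deg, width_cm=STANDARD_WIDTH_CM):
    """Frontal projection of the horizontal axis, assuming cosine foreshortening."""
    return width_cm * math.cos(math.radians(slant_deg))


cosines = [math.cos(math.radians(s)) for s in SLANTS_DEG]
# Differences between successive cosines: the "interval steps" mentioned in the text.
steps = [a - b for a, b in zip(cosines, cosines[1:])]

for slant, c in zip(SLANTS_DEG, cosines):
    print(f"{slant} deg: cos = {c:.3f}, projective width = {projective_width(slant):.2f} cm")
print("cosine steps:", [round(s, 3) for s in steps])
```

Under this assumption the two cosine steps are both close to 0.23, and the projected widths agree with the projective-shape values given in the note to Table 2.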

During the judgements the experimenter stood out of sight of the subject behind the standard viewing box. 
After each judgement the standard was obscured with the black curtain and the experimenter reset the 
variable aperture for either an ascending or descending judgement and changed the slant of the standard. The 
possible involvement of kinaesthetic cues was minimized by the use of ascending and descending judgements 
and the fact that the aperture could be set at variable starting points beyond the 0° and 90° slants. After 
every three judgements the experimenter removed the standard from its holder and pretended to replace it 
with another shape. His hand movements were hidden by the curtain but the subject could hear the 
imaginary changeover. This action was designed to prevent the possibility that the subject would begin 
making his adjustments from memory after the first few trials. That the deception succeeded was very clear 
from subjects’ replies to a post-experimental inquiry. 

In the pretraining phase of the experiment all subjects were shown the first two pretraining series with the 
same size then different size geometrical forms and each time they were asked to point out two things of the 
‘same shape’. Next, each subject was familiarized with the operation of the variable shape apparatus 
using the black 15 × 10 cm ellipse on the white cardboard sheet. At this stage an attempt was 
made to eliminate possible orientation confusions by showing that two different shapes could be formed 
when the major axis was first vertical and then horizontal. Subsequently, each subject was required to adjust 
the variable patch of light to match the 15.2 × 7.6 cm ellipse in order to check task comprehension. 

The subjects were given either objective or projective instructions according to their grouping. The 
objective instructions were: ‘When I turn a shape away from you, it stays the same shape (demonstrated 
with 15 × 10 cm ellipse). It is still the same shape because I can turn it back straight on to you like this. In the 
box here, I have another shape that is just the same as this one. Although it is turned away it is still the 
same shape because I can put the first one over it like this. See? And to show you that it is really the same 
shape I can take it out of the box and put it over the other one like this. So although this one might be 
turned away from you it is still the same shape as the other one. Now I am going to show you some more 
yellow shapes in this box and I want you to make on the screen there what you think is really the same 
shape as the one I show you in the box. So that if you took the one out of the box and held it in your own 
hands it would really be the same shape as the one you make on the screen. Now when you think you have 
made really the same shape as the one I am going to show you will you take your hand off the wheel and say 
“right”?’ 

The projective instructions were accompanied by a practical demonstration of the projection of shadows 
on flat backgrounds. In the first part of the demonstration the light of a pocket torch was first directed at the 
15 × 10 cm elliptical cut-out, in the darkened laboratory, to show that its shadow shape, which was projected 
onto white cardboard, changed as the experimenter turned it away from the subject’s fronto-parallel plane. 
In the second part of the demonstration, the torch light was directed at the ellipse in the standard viewing 
box from a position near the subject's eyes and a circular shadow was projected on the checkerboard 
background. The circular demonstration stimulus, 10 cm in diameter, was also placed in front of the slanted 
stimulus in the box to show its circular projection. The complete projective instructions were: ‘When I turn 
a shape away from you it looks different. It gets thinner and thinner the more I turn it away, just like its 
shadow shape. Can you see its shadow shape? Right! In the box here I have a yellow shape that happens to 
have a shadow shape just like a circle. See if I take a circle it fits right over it. (Experimenter placed the 
circle across the front of the slanted ellipse in the fronto-parallel plane.) Can you see that? Well, I can show 
you in another way that this shape happens to have a shadow shape like a circle if I take a torch 
(experimenter switched off the light in the box) and shine it at the shape, from near you. See how the 
shadow shape looks like a circle? Well I am going to show you some more yellow shapes in this box and I 
want you to make on the screen there what you think is the shadow shape of the one I show you in the box. 
Now when you think you have made what looks like the shadow shape of the one in the box will you take 
your hand off the wheel and say “right”?’ 


Results 


The raw data obtained from each subject consisted of three angular settings of the variable 
aperture for each of the three slants of the standard stimulus. The means of the subject's three 
adjustments at the different slants were taken as his judgement. Each of the three mean angular 
settings was converted into its cosine function and multiplied by a constant 24. The transformed 
value represented the judged horizontal axis width of the standard relative to the mean adjusted 
width of the variable patch of light, for a particular slant of the standard. The transformation 
enabled a more meaningful interpretation of the results in terms of degree of shape constancy, 
than would be possible with findings reported in the form of angular settings of the variable 
aperture (Borresen & Lichte, 1962). 
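
The transformation can be sketched as follows. The text does not derive the constant 24; one reading, treated here as an assumption, is that the variable shape is 16 cos θ cm wide with a fixed 8 cm vertical axis, so its width-to-height ratio is 2 cos θ, and scaling that ratio to the 12 cm vertical axis of the standard gives 12 × 2 cos θ = 24 cos θ. The function names are illustrative only.

```python
import math


def judged_width_cm(settings_deg, scale=24.0):
    """Judged horizontal axis of the standard (cm) from one subject's aperture settings.

    The settings for one slant are averaged first and the mean angle is then
    converted to its cosine and multiplied by the constant, as described in the text.
    """
    mean_setting = sum(settings_deg) / len(settings_deg)
    return scale * math.cos(math.radians(mean_setting))


def setting_for_width(width_cm, scale=24.0):
    """Aperture setting (degrees) that would correspond to a given judged width."""
    return math.degrees(math.acos(width_cm / scale))


# A setting of 60 degrees projects a circle, i.e. a judged width equal to the
# 12 cm vertical axis of the standard.
print(round(judged_width_cm([60, 60, 60]), 2))
# Setting that would reproduce the objective 9 cm width under this reading.
print(round(setting_for_width(9.0), 2))
```

On this reading a perfectly objective match corresponds to an aperture setting of roughly 68°, and smaller settings indicate wider judged shapes.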

The means and standard deviations of the transformed shape judgements for all 180 subjects 
are shown in Table 2. Each mean represents the judgements of nine subjects. It can be seen that 
the mean judgements of the objective instruction group undergo only a slight decline over the 
increasing degree of stimulus slant, while the decline of the projective instruction group is more 
marked, especially at the 72° slant. 


Table 2. Means and standard deviations of the shape judgements (cm) of subjects in each of the 
five age groups tested 

               Objective instructions                        Projective instructions 
               Male                  Female                  Male                  Female 
Age 
group          39°    57°    72°     39°    57°    72°       39°    57°    72°     39°    57°    72° 

 6     M       9.12   9.02   8.35    9.4    9.19   8.97      8.54   7.84   6.22   11.70   9.17   6.88 
       S.D.    2.25   2.27   2.23    2.03   2.93   3.39      2.56   2.13   3.05    2.097  4.14   4.26 
 8     M       8.47   8.01   7.60   10.17   9.62   9.14      8.53   6.98   4.75    9.0    7.27   4.78 
       S.D.    1.63   2.17   1.49    3.13   3.40   3.43      1.26   1.58   1.91    1.70   1.71   2.02 
10     M       9.31   8.57   7.85    9.0    8.60   8.16      9.36   6.79   4.22    9.75   7.57   4.90 
       S.D.    1.90   1.75   2.13    0.66   0.90   1.81      1.85   0.95   0.88    1.57   1.80   2.50 
12     M       9.53   9.11   7.97    9.46   9.32   9.65      9.63   7.58   4.98   10.52   8.56   5.39 
       S.D.    1.20   1.39   2.39    0.72   1.21   1.25      1.70   1.98   1.21    2.60   2.00   2.75 
19     M       9.58   9.77   9.51   10.35  10.02  10.71      9.28   6.80   4.07   10.39   8.31   5.46 
       S.D.    1.17   1.71   2.08    0.92   1.44   1.49      1.33   1.75   1.27    1.54   2.42   2.86 

Note. A value of 9.00 corresponded to the objective shape. Values of 6.99, 4.90 and 2.78 corresponded to 
the projective shape at 39°, 57° and 72° slant respectively. 


The data were analysed by a four-factor analysis of variance. No significant effects were found 
among the age groups, F = 1.91, d.f. = 4, 160, P > 0.05, but a significant effect, the consistency of 
which can be seen in Table 2, was found between the sexes, F = 8.38, d.f. = 1, 160, P < 0.01. 
Both the main effects of instructions, F = 33.05, d.f. = 1, 160, P < 0.001, and angle of slant, 
F = 196.94, d.f. = 2, 320, P < 0.001, were significant as was also the interaction between 
instructions and angle of slant, F = 113.09, d.f. = 2, 320, P < 0.001. 
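
The reported degrees of freedom can be reproduced with a small calculation. The sketch assumes a conventional mixed design in which age, sex and instructions are between-subject factors and angle of slant is the repeated measure; the text states only that a four-factor analysis of variance was used, and the function name is illustrative.

```python
from math import prod


def mixed_design_df(n_subjects, between_levels, within_name, within_levels):
    """Degrees of freedom for several between-subject factors and one within-subject factor."""
    n_cells = prod(between_levels.values())      # number of independent groups
    between_error = n_subjects - n_cells         # subjects within groups
    within_df = within_levels - 1

    df = {name: levels - 1 for name, levels in between_levels.items()}
    df["error (between)"] = between_error
    df[within_name] = within_df
    for name, levels in between_levels.items():  # first-order interactions with the repeated factor
        df[f"{name} x {within_name}"] = (levels - 1) * within_df
    df["error (within)"] = between_error * within_df
    return df


df = mixed_design_df(
    n_subjects=180,
    between_levels={"age": 5, "sex": 2, "instructions": 2},
    within_name="slant",
    within_levels=3,
)
for effect, value in df.items():
    print(f"{effect}: {value}")
```

With 180 subjects in 20 groups this gives 160 d.f. for the between-subject error term and 320 for the within-subject error term, which agrees with the values quoted for the F ratios.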

Mean comparisons were carried out on the interaction effect using a per contrast error rate of 
0.02. No significant differences were found in the objective shape judgements over the increasing 
degrees of slant, but the mean judgements at each angle of slant for the projective instruction 
condition were all significantly different from each other. Tests of the differences between the 
means of the two instruction conditions at each angle of slant showed a significant difference at a 
slant of 72°. This revealed that there are significantly higher degrees of constancy in objective 
shape judgements when compared with projective judgements only at the greater angles of slant. 


Discussion 


The present data, which reflect no significant changes in mean shape judgements from age 6 to 
19 years, do not accord with the results of previous developmental studies of shape matching 
(Klimpfinger, 1933; Vurpillot, 1964; Meneghini & Leibowitz, 1967; Kaess, 1971). Although there 
are important methodological differences, discussed below, between the present study and past 
experiments, these data clearly do not support the empiricists’ interpretation of the development 
of shape constancy. When the observer’s judgemental task can be operationally defined and 
when the viewing conditions are optimal it seems that age changes are not likely to occur after 
age six years. 

The superior projective size estimates of eight year olds as compared with adults, found by 
Piaget & Lambercier (1951) do not seem to have generality for shape judgements. Neither was 
there any support in the present investigation for Gibson’s (1969) prediction that there would be 
an increase in the accuracy of projective shape judgements over age. It would be useful to 
discover when children can begin to make judgements on the basis of projective shape or size. 

The present study has provided direct evidence of the ability of six year olds to make 
projective shape judgements. This seems to be the youngest age group yet tested on a projective 
judgement task. 

Both absence of an age trend in mean constancy and the differential response of the youngest 
children to the objective and projective instructions seem to be mainly a function of the 
effectiveness of the instruction procedures which were used. Great care was taken to ensure 
that any vagueness was eliminated from the instructions and by the use of repetition and 
demonstration it was hoped that age differences in comprehension would be precluded. 
Parenthetically, the lack of an age trend was not the result of changes in administration of 
instructions over age. Care was taken for example, to maintain the same rate of instruction 
presentation to young and old subjects alike. 

The careful formulation of the instructions represents probably the most important difference 
between the present study and the previous literature on shape constancy development, apart 
from the fact that most of the earlier studies have employed phenomenal instructions. The 
probability of obtaining phenomenal judgements in young children that differ from, 
especially, projective judgements seems very small indeed, since this subtle distinction does 
not seem even to be appreciated by many adults (Landauer, 1964 b). 

Makino (1965) in reviewing developmental studies on shape judgements in Japan noted that no 
age changes were found using the method of adjustment and he argued that such changes were 
less likely in judgemental situations allowing apprehension by the subject of the dimensional 
range of the variable stimuli being employed. Such an interpretation might be applied to the 
present results. However, the study of Meneghini & Leibowitz (1967), which also used the 
method of adjustment but phenomenal instructions, revealed a decrease in constancy with age. 
This suggests the critical importance of instructions to the subject as a determinant of age 
trends, rather than the type of judgemental method employed. 

There are a number of other methodological differences worth noting between the present 
study and others. First, an elliptical standard was preferred in the present case because it 
facilitated the demonstration of the concept of projective shape. There has been little uniformity 
in the use of stimuli across studies. Previous developmental studies of shape constancy have 
used rectangles (Kaess, 1971), ellipses (Klimpfinger, 1933), circles (Meneghini & Leibowitz, 
1967), and nonsense shapes (Vurpillot, 1964). Second, a checkerboard background in the viewing 
box was included to enhance stimulus information for slant (Landauer, 1964 a). The viewing 
conditions, therefore, were less reduced than those of other studies. These 
methodological differences may have contributed to the finding that no significant increase in 
constancy occurred with age which is contrary to the observations of Kaess (1971). 

The shape matches of subjects in the objective instruction condition revealed a tendency 
towards overconstancy at slants of 39 and 57 degrees. Evidence of overconstancy has not been a 
common finding in shape constancy studies. Wohlwill (1963) has suggested that this is because 
of the rather impoverished viewing conditions and also because most experimenters have used 
phenomenal instructions. A demand for objective shape judgements resulted in overconstancy in 
the case of the adults in Landauer’s (1969) study and in the subjects older than nine years in the 
experiment of Kaess (1971). The finding of no significant changes in the tendency towards 
constancy of objective shape judgements with increasing slant conforms with results obtained 
using adults by Joynson & Newson (1962) and Landauer (1964 a). This result also supports the 
general conclusion that high degrees of shape constancy will be observed under relatively normal 
viewing conditions with instructions clearly demanding objective judgements. Landauer 
(1964a, b; 1969) also found significantly lower degrees of constancy under projective 
instructions than under objective instructions, particularly with greater degrees of slant. 
The general finding that projective shape judgements are very inaccurate under unreduced 
viewing conditions (Lichte & Borresen, 1967) was also confirmed and lends support to the 
assertion that the tendency towards shape constancy cannot easily be overcome by the subject. 

The consistently higher degrees of constancy in the objective and projective shape judgements 
of females compared with males are interesting as research on sex differences in perception and 
perceptual development is extremely limited. Witkin et al. (1962) offered an indirect explanation 
for sex differences in shape judgements. From their claim that females are more field dependent 
than males it could be inferred that greater constancy would be reflected in female judgements. In 
many shape constancy studies the sex of subjects has not even been noted (Borresen & Lichte, 
1962; Epstein, 1962; Leibowitz & Meneghini, 1966). It now appears that the possible role of sex 
differences in space perception has not been adequately explored and that their nature in visual 
shape judgements needs closer examination. 


Evidence that focal processing involves a build-up of a visual object 


Robert T. Solman 





Subjects were required to make one of three judgements when presented with a tachistoscopically displayed 
letter, and the PEST procedure was used to obtain point estimates of the stimulus exposure time required to 
yield a 50 per cent (corrected for chance) level of accuracy. The judgements were ordered in terms of the 
information required, and on the assumption that pattern recognition involves the build-up of an appropriate 
visual object (Neisser, 1967), it was predicted that as the visual representation required to make a judgement 
increased in detail, the exposure time required would also increase. The results showed that stimulus 
exposure time increased as subjects located at one of four positions, classified as angular or circular, and 
identified a letter from the set F,T,O,Q. The type of information required to make these judgements, 
changed from the figure-ground discrimination needed for location, to the detailed level of representation 
needed for identification. Therefore, the results supported the prediction and were taken as evidence that 
pattern recognition involves the build-up of a visual object. The judgements location, classification, and 
identification were also made on a letter from the set F,f,Q,q, but in this case the classification judgement 
required subjects to name the letter independent of its case. The results showed that naming was just as 
difficult as identifying and this suggested that the level of representation required to name the letter was the 
same as that required to identify it. 





In his discussion of focal attention and figural synthesis, Neisser emphasized the active, 
constructive nature of the processing carried out by the focal mechanism: ‘. . . it is important to 
think of focal attention as a constructive, synthetic activity rather than as purely analytic. One 
does not simply examine the input and make a decision, one builds an appropriate visual object’ 
(Neisser, 1967, p. 94). While dynamic views of pattern recognition are well established in 
perceptual psychology (cf. Sutherland, 1959; Uhr, 1966; Gibson, 1969; Dodwell, 1970; Sutcliffe, 
1971), they have had little influence in the information-processing literature. For example, with 
exceptions such as Flavell & Draguns (1957), Smith (1957) and Neisser (1967), most investigators 
have tended to treat the perception of simple familiar patterns as an ‘all-or-none’ process (e.g. 
Sperling, 1960; Sternberg, 1967; Egeth, Jonides & Wall, 1972). However, the notion of a 
dynamic and multistage pattern-recognition process is of direct relevance to a number of issues 
investigated in the information-processing literature, and there is some need to show that this 
process involves the development of a structural representation of the pattern. [For example, 
investigations which suggest that location information and form information are processed 
independently along separate channels (Logan, 1975), may merely demonstrate that the 
distinction between figure and ground (all that is necessary to locate the pattern) is an initial 
stage in the build-up of a visual representation.] 

In the early part of this century the tachistoscope was seen as restricting processing so that it 
could be used to examine the sequencing of early stages of perception. For example, Wever 
(1927) used a mask and a brief display to show that ‘smudges’ or ‘blotches’ preceded the 
development of a ‘vague shape’ which subjects were willing to call figure on ground. However, 
this approach to the problem, and this interpretation of tachistoscopic perception, have largely 
been abandoned in favour of reaction time studies in modern work. As discussed in a previous 
paper (Solman, 1975), when reaction time is used as a dependent measure, there is no control 
over how much processing is carried out by the subject. Consequently, Wever's approach was 
adopted in the present study. 

If the perception of a familiar pattern depends on a gradual construction of a visual 
representation, then by limiting the time available for processing and 'tapping' the information 
present in the system, evidence of early levels of representation might be found. That is, once 
the representation of the pattern can no longer be completed any pattern recognition must be 
based on levels of representation achieved early in the process, and errors will reveal 
information about these early levels. Alternatively, instead of examining the type of errors made 
we may vary the judgement that the subject is required to make. The types of required 
judgement may be ordered in terms of the specificity of the information required. In the 
experiment to be reported subjects were presented with a single letter from the set F,T,O,Q. 
They were required (a) to locate it at one of four possible positions, (b) to classify it as angular 
or circular, and (c) to identify it. These judgements require the build-up of progressively more 
detailed levels of representation, ranging from the figure-ground information needed to locate 
the letter to the detailed structural representation necessary for its identification. By determining 
the stimulus exposure time needed to make each of these judgements we may estimate the 
processing time needed to build up the particular level of representation. 

To this point in the discussion it has been assumed that classification of a stimulus follows the 
initial discrimination of structural attributes. But, there are a number of studies which can be 
taken to suggest that non-structural attributes may be discriminated at the same time as the 
structural attributes. For example, studies by Brand (1971), and Ingling (1972), suggest that 
alpha-numeric characters need not be identified before they can be classified as letters or digits, 
and a study by Henderson (1973), suggests that information required to group letters having the 
same name but different case, may be available at the pre-attentive stage of analysis. The results 
of these studies imply either that (a) non-structural characteristics may somehow be extracted 
without prior identification of object structure (which is unlikely), and/or (b) that identification 
may take place at the first stage of analysis during feature processing. The first of these 
possibilities was also examined in the experiment to be reported in the present paper. That is, 
subjects were required to make three types of judgement (location, classification and 
identification) of a letter selected from the set F, f, Q, q. The name of the letter only is required 
for classification. If this judgement depends on the same level of representation as that needed 
for identification, the estimates of processing time should not differ for the two. On the other 
hand, the estimate of processing time for the shape judgement (angular vs. circular) should be 
shorter than the estimate for the name judgement. 


Method 


The PEST (parameter estimation by sequential testing) procedure was used to estimate directly the time 
under each condition at which the subject had a pre-specified probability of successfully making the 
judgement required (Taylor & Creelman, 1967; Pollack, 1968). There were six experimental conditions 
generated by the two sets of letters (F, T, O, Q, and F, f, Q, q) and the three judgemental tasks (location, 
classification and identification). Throughout the experiment a trial consisted of the following steps. After 
placing a stimulus card in the tachistoscope the experimenter gave a verbal ‘ready’ signal upon which the 
subject fixated the cross, and pressed a hand switch which removed the cross and initiated the display. The 
display remained for a pre-set period and was replaced by a 50 msec mask. The subject responded verbally 
and the experimenter recorded the response, and told the subject whether it was correct or incorrect. 


The PEST procedure 


Parameter estimation by sequential testing (Taylor & Creelman, 1967; Pollack, 1968), is an adaptive method 
for finding a desired level of an independent variable. In the present study, PEST was used to find the 
level of stimulus exposure (L_t) (as far as possible notation is taken from Taylor & Creelman) which yields 
a pre-specified proportion of correct responses (referred to as the target probability, P_t). P_t was in this 
case set at a value which when corrected for chance performance gave a probability of a correct response of 
0.500. To calculate the P_t value it was assumed that guessing responses would be equally distributed over 
all response alternatives (A), and as a consequence P_t was set equal to 0.500 + (0.500/A). For example, 
in the condition where subjects were required to locate a letter at one of the four clock positions 2, 5, 8 
or 11, P_t = 0.500 + (0.500/4) = 0.625. 
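
The target probability rule above can be written as a short function. This is a sketch only: the general form with an arbitrary corrected level is the standard correction for guessing and is an addition here, and the two-alternative case is shown purely as an illustration of the formula rather than as a value reported in the text.

```python
def target_probability(n_alternatives, corrected_level=0.5):
    """Raw proportion correct that equals `corrected_level` after correction for guessing.

    Assumes guesses are spread equally over the response alternatives, so a
    pure guess is correct with probability 1 / n_alternatives.
    """
    return corrected_level + (1.0 - corrected_level) / n_alternatives


def corrected_for_chance(p_correct, n_alternatives):
    """Inverse relation: proportion correct after removing the guessing component."""
    chance = 1.0 / n_alternatives
    return (p_correct - chance) / (1.0 - chance)


print(target_probability(4))   # four response alternatives, as in the location task
print(target_probability(2))   # illustrative two-alternative case
```

With four alternatives the function returns the 0.625 given in the text, and applying `corrected_for_chance` to that value recovers the 0.500 corrected level.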

The PEST procedure requires testing at a series of stimulus exposure times. For each exposure time a 
decision is made as to whether the probability of a correct response is greater or less than P_t. The procedure 
provides a set of rules (see Taylor & Creelman, 1967, for details) which specify the size and direction of the 
step which the experimenter must take in selecting the next exposure time. These rules effect a bracketing 
and gradual convergence on the desired level of exposure time, L_t. The procedure used at any particular 
exposure time to test whether the probability of a correct response is greater or less than P_t, was a Wald 
(1947) sequential likelihood-ratio test. That is, the subject is tested repeatedly, and a running total of correct 
responses is maintained. At the completion of each trial this total is compared with the previously computed 
sequential test bounds (details below), and if it is equal to or lies outside either the upper or lower bound the 
probability of a correct response differs from P_t and a change in exposure time is called for. 

The computation of a Wald test is tedious (see Wald, 1947, pp. 88-105), but as pointed out by Taylor & 
Creelman its application is simple. ‘If the current testing level were exactly L_t, the expected number of 
correct trials E(N(C)) = P_t × T after T trials. The sequential test bounds are given by the expected number of 
events plus and minus a constant N_b(C) = E(N(C)) ± W, where N_b(C) is the bounding number of events after 
T trials, W is a constant, called the deviation limit of the sequential test’ (Taylor & Creelman, 1967, p. 783). 
The power of the test is directly related to the size of W, and a fairly large deviation limit of 4 was selected 
for use in the present investigation. 
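
The bound check described above can be sketched in a few lines. This is an illustration of the simplified test as quoted, not the original implementation; the function names and the decision labels are hypothetical. A run that reaches the upper bound is taken to mean that performance exceeds the target, so the exposure time should be reduced, and the reverse for the lower bound.

```python
def wald_decision(n_correct, n_trials, p_target, w=4.0):
    """Compare the running total of correct responses with the sequential test bounds."""
    expected = p_target * n_trials       # E(N(C)) = P_t x T
    if n_correct >= expected + w:        # on or above the upper bound
        return "decrease"                # performance above target: shorten exposure
    if n_correct <= expected - w:        # on or below the lower bound
        return "increase"                # performance below target: lengthen exposure
    return "continue"


def run_wald_test(responses, p_target, w=4.0):
    """Apply the check after every trial; return the decision and the trials used."""
    n_correct = 0
    n_trials = 0
    for correct in responses:
        n_trials += 1
        n_correct += bool(correct)
        decision = wald_decision(n_correct, n_trials, p_target, w)
        if decision != "continue":
            return decision, n_trials
    return "continue", n_trials


print(run_wald_test([True] * 20, p_target=0.625))
print(run_wald_test([False] * 20, p_target=0.625))
```

With a target probability of 0.625 and a deviation limit of 4, an unbroken run of correct responses reaches the upper bound later than an unbroken run of errors reaches the lower bound, because the expected count rises by 0.625 per trial.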

When applying PEST in the present study, testing was begun with a level of stimulus exposure time (100 
msec) which (from previous experience) yielded a greater proportion of correct responses than P_t, and it 
was assumed that a Wald test had previously indicated that a hypothetical exposure time of 116 msec be 
reduced by the maximum step size of 16 msec. The exposure time was reduced in steps of 16 msec until a 
level was reached which yielded a smaller proportion of correct responses than P_t. The exposure time was 
then increased by half the previous step size (i.e. 8 msec). Testing at this new level could indicate that the 
exposure time be (a) reduced or (b) increased further. In the former case the step size was again halved (i.e. 
the exposure time was reduced by 4 msec), and in the latter it remained the same (i.e. a further increase of 
8 msec). The procedure was continued in this manner, i.e. by halving the step size for reversals, and by 
leaving unchanged or doubling (occasions when the step size should be doubled are specified by Taylor & 
Creelman, 1967) the step size for changes in the same direction, until a minimum step size of 0.5 msec was 
called for. At this point testing was terminated, and the resulting level of stimulus exposure time was taken 
as the point estimate of L_t, i.e. the point estimate of the exposure time required to yield the target 
probability, P_t. 
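
The stepping rules described above can be sketched in Python. This is a simplified illustration under stated assumptions, not Taylor & Creelman's full procedure: the step is halved on every reversal and otherwise left unchanged (the doubling rules are omitted), the final minimum-size step is applied before the estimate is returned, and the subject is replaced by a hypothetical deterministic observer. The names `wald_test` and `pest` are illustrative only.

```python
from typing import Callable


def wald_test(respond: Callable[[float], bool], level: float,
              p_target: float, w: float, max_trials: int = 1000) -> int:
    """Test one exposure level; return -1 to shorten exposure, +1 to lengthen it."""
    correct = 0
    for trial in range(1, max_trials + 1):
        correct += 1 if respond(level) else 0
        expected = p_target * trial          # E(N(C)) = P_t x T
        if correct - expected >= w:          # at or above the upper bound
            return -1                        # performance above target
        if expected - correct >= w:          # at or below the lower bound
            return +1                        # performance below target
    raise RuntimeError("no decision reached within max_trials")


def pest(respond: Callable[[float], bool], start: float = 100.0,
         step: float = 16.0, min_step: float = 0.5,
         p_target: float = 0.625, w: float = 4.0) -> float:
    """Simplified PEST run: halve the step on reversals, keep it otherwise."""
    level = start
    direction = -1                           # assumed earlier step: 116 -> 100 msec
    while True:
        outcome = wald_test(respond, level, p_target, w)
        if outcome != direction:             # reversal: halve the step
            step /= 2.0
            direction = outcome
        if step <= min_step:                 # minimum step called for: stop
            return level + direction * step  # assumption: final step is applied
        level += direction * step


# Hypothetical observer who is correct whenever exposure is at least 50 msec.
estimate = pest(lambda exposure: exposure >= 50.0)
```

With this idealized observer the run descends in 16 msec steps from 100 msec, brackets the 50 msec threshold, and stops once a 0.5 msec step is called for. A real subject responds probabilistically, so each Wald test would take a variable number of trials.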


Design 

The estimates of processing time required to locate, classify, and identify the two groups of four letters, 
were obtained from eight experienced subjects (all had 6/6 vision, and received $2 per hour). Four of the 
subjects were presented with the letters F, T, O, and Q, prior to F, f, Q, and q, and vice versa for the other 
four. This yielded a three-way design with presentation order for the sets of letters varied between subjects, 
and letter-set and type of judgement (location, classification, and identification) varied within subjects. Care 
was taken to present the conditions specified for later comparison in a manner which achieved equivalence 
for all subjects. This was accomplished by presenting the tasks in the order location → classification → 
identification to two subjects from each group, and in the reverse order (identification → classification → 
location) to the other two, i.e. the classification judgement always occurred in sessions 2 and 5, and the 
order of the classification and the identification judgements was counterbalanced. One estimate of the level 
of stimulus exposure (L_t) required to yield the target probability of correct responses (P_t) was collected in 
each experimental session. This required subjects to attend six experimental sessions each of 1 to 3 hr 
duration (there were two 2 hr practice sessions at the beginning). 
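
The counterbalancing scheme can be written out as a session schedule. The sketch below assumes that sessions 1 to 3 use the first letter set and sessions 4 to 6 the second, with the same task order in both halves; this is inferred from the statement that classification fell in sessions 2 and 5. The labels `FTOQ` and `FfQq` and the function name `schedule` are hypothetical.

```python
from itertools import product

LETTER_SETS = {"FTOQ": ("F", "T", "O", "Q"), "FfQq": ("F", "f", "Q", "q")}
TASKS = ("location", "classification", "identification")


def schedule(first_set: str, forward: bool) -> list[tuple[int, str, str]]:
    """Return (session number, letter set, task) for one counterbalancing cell."""
    set_order = [first_set] + [s for s in LETTER_SETS if s != first_set]
    task_order = TASKS if forward else TASKS[::-1]
    sessions = []
    for letter_set in set_order:
        for task in task_order:
            sessions.append((len(sessions) + 1, letter_set, task))
    return sessions


# Four cells (letter-set order x task order), two subjects per cell = eight subjects.
plans = {cell: schedule(*cell) for cell in product(LETTER_SETS, (True, False))}
```

In every cell the classification task lands in sessions 2 and 5, as the text requires.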


Materials and apparatus 


Stimuli were constructed by mounting a Letraset (18 pt Grotesque 216) letter at one of four clock positions, 
and when positioned the letters were approximately 0.27° of visual angle high with a stroke width of 0.06°. 
There were two sets of four letters, and since each letter appeared at each of the four selected clock 
positions it was necessary to prepare 2 × 4 × 4 = 32 stimulus cards. 

The fixation card had a black cross mounted at its centre, and the mask consisted of a jumble of broken, 
distorted, and whole Letraset letters. All cards were displayed in a Model GB Scientific Prototype 
Three-Channel Tachistoscope with channel luminance for fixation, stimulus and mask of 10.8, 17.2, and 38.7 
millilamberts respectively. 


Procedure 


Subjects were instructed that for each of the three judgemental tasks (location, classification, and 
identification), only the information asked for was to be given in the response, and that when forced to guess 
they should attempt to give the possible alternatives equally often. During each practice session the 
judgemental tasks were presented in the experimental order with 40 min devoted to each, and one, 2 hr 
session was devoted to each of the two sets of letters (the order of these sessions depended on the subject 
group). 

When making a location or identification judgement the subject must make one of four possible responses, 
and if he selects an alternative at random when forced to guess, a non-guessing accuracy level of 50 per cent 
is represented by a performance level of 62.5 per cent. That is, the P_t value for these two judgements was 
calculated at P_t = 0.500 + (0.500/4) = 0.625. A deviation of ±0.064 on this target probability, and a deviation 
limit (W) of 4, specified a probability of a correct decision of 0.90 for each individual Wald test. When 
making the classification judgement the subject must make one of two possible responses. Consequently, P_t 
was set at 0.750, and with a deviation of ±0.051 and W again set at 4, the probability of a correct outcome 
from each Wald test was again 0.90. 
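
The guessing correction used to set the target probabilities can be expressed as a one-line function. This is a sketch of the arithmetic given above, assuming that a subject who has not extracted the required information picks one of the alternatives at random; the function name `target_probability` is illustrative.

```python
def target_probability(true_rate: float, n_alternatives: int) -> float:
    """Observed proportion correct when all remaining trials are random guesses."""
    return true_rate + (1.0 - true_rate) / n_alternatives


p_four_choice = target_probability(0.5, 4)   # location and identification
p_two_choice = target_probability(0.5, 2)    # classification
```

For a non-guessing accuracy of 50 per cent this gives 0.625 with four alternatives and 0.750 with two, the two target values used in the study.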


Results 


The mean levels of stimulus exposure time required for the three judgemental tasks and the two 
sets of letters are shown in Fig. 1. Observation of these results indicates (a) that exposure time 
increased as subjects made the judgements location, classification and identification and (b) that 
classification using letter-name requires a longer exposure time than classification using 
letter-shape. A three-way analysis of variance with letter-set (F, T, O, Q, and F, f, Q, q) and 
judgemental task (location, classification and identification) analysed as within-subject variables, 
and the order in which the sets of letters were presented analysed between subjects, supported 
the observations. That is, the effects of judgemental task, letter-set, and their interaction were 
significant (F = 66.37, d.f. = 2, 12, P < 0.001; F = 5.94, d.f. = 1, 6, P < 0.05; and F = 27.33, 
d.f. = 2, 12, P < 0.001 respectively). The interaction between the presentation order of the sets of 
letters, and letter-set also reached significance (F = 18.89, d.f. = 1, 6, P < 0.01), but since it was 
caused by subjects improving their performance during the experiment, it was of little 
theoretical importance. 

Figure 1. The average stimulus exposure time required to locate, classify and identify the sets of letters F, T, 
O, Q, and F, f, Q, q. 


Discussion 
Development of structure 


It was pointed out in the introduction that the judgements made by the subject when locating 
one of the letters, F, T, O, Q at one of four clock positions, classifying it as angular or circular, 
and identifying it, require the build-up of progressively more detailed levels of representation. 
The results (Fig. 1) showed that stimulus exposure time increased directly with the level of detail 
demanded by the representation, and since masked brief exposures restrict the time available for 
processing (e.g. Wever, 1927; Sperling, 1963), it follows that the more detailed the level of 
representation required, the greater the period of processing needed. This finding is perhaps not 
surprising in the light of nearly 100 years' experience with tachistoscopes. It confirms the 
common experience here that the longer a subject looks at something the more he sees of it. On 
the other hand, it is still necessary to explain how this occurs. 

Assuming the two-stage model, there are two possible explanations of the data. The first of 
these supposes that the time course of forming a focal representation was the same for each of 
the judgemental tasks. It supposes that this representation was built-up gradually, so that, first of 
all, information about location was available to the subject, then general information about 
structure, and, finally, the specific structure necessary for identification. It assumes that the 
mask stops this build-up at the point where it occurs. Thus, if it occurred at the point between 
the differentiation of figure-ground and the emergence of the completed structure, the subject 
would be able (if asked) to locate the letter and possibly to classify it, but he would not be able 
to identify it. This explanation contrasts with the view implicit in current serial and parallel 
processing models of letter perception (e.g. Sperling, 1960; Sternberg, 1967; Egeth et al. 1972), 
that this is an all-or-none affair. 

The second explanation supposes that the different judgemental tasks direct the subject to 
construct different focal representations, and that these take different times to complete. Thus 
when instructed to locate the letter his task is to differentiate figure from ground, and judge the 
position of the figure relative to the clock. It is supposed that in this case the focal processing is 
directed to this end, and the data imply that this can be achieved relatively quickly. On the other 
hand, when required to classify the letter, processing must be directed to distinguishing it as an 
‘angular’ or ‘circular’ type of figure. (Its location might or might not be processed as well.) 
This, according to the data would take approximately 16 msec (this time includes pre-attentive 
processing). Finally, when required to identify the letter, his processing must derive a full 
structural representation. The data would imply that this takes a total of about 22 msec. 

The second explanation seems to be the one which would be favoured by Neisser (i.e. '..., 
one builds an appropriate visual object', Neisser, 1967, p. 94, my italics). It implies that when 
asked to judge a ‘lower level’ attribute, the subject does not perceive the higher level attributes 
of the figure. It conforms to our general notions of the way in which attention operates. 
However, there are grounds for rejecting it here. In the first place, subjects in this experiment 
were (at longer exposure times) frequently able to name the letter correctly even when asked 
only to classify it. Secondly, if this explanation were correct, it is difficult to see why the 
progressively higher levels of representation should have taken longer to achieve. After all, 
according to this view the identification of a letter as belonging to the class ‘angular’, is an event 
of precisely the same type as its identification as belonging to the class ‘T’. 

It is not suggested that the course of focal processing is always obligatory. In the present 
experiment there was a logical dependency between the levels of representation of the figure so 
that, for example, identifying a letter automatically classified it. Under these circumstances, it is 
quite likely that subjects would choose to identify whatever the instructions. On the other 
hand, there are classes of pattern, for example, ambiguous figures, where alternative views are 
mutually exclusive, and a subject would be forced to choose between one view and another. The 
point is that whichever view the subject chooses to take, the process of constructing the 
structural representation requires time, and that this representation is built progressively at the 
focal level. This means that if processing is interrupted, the subject still has available to him 
the early stages of the structure. 


Naming 


The nature of the judgements required in the case of letters from the set F, T, O, Q, was 
dictated by the assumption that the classification of a stimulus requires the initial build-up of a 
structural representation. However, the results of studies (reported in the introduction) by Brand 
(1971), Ingling (1972) and Henderson (1973) could be taken to suggest that non-structural 
attributes might somehow be extracted without prior identification of object structure. This 
possibility was tested by having subjects make location, classification, and identification 
judgements on letters from the set F, f, Q, q, where classification consisted of naming the letter 
(i.e. F, f vs. Q, q). If naming requires the same level of structural representation as that required 
for identification, then (a) estimates of processing time for naming and identifying a letter will 
not differ, and (b) naming will require more time than that required to specify the letter as 
angular or circular (letter-set F, T, O, Q). The results supported these predictions [t (critical value 
1.78) = 1.09, P > 0.10; t (critical value 2.93) = 3.51, P < 0.01 respectively], suggesting that a 
detailed structural representation is required before naming can take place. 


Levels of mastery of the coordinators and and or and logical test performance 


Bo S. Johansson 


The purpose of the present experiment was to study the relation between mastery of the words and and or 
and performance on conjunctive and disjunctive tasks. A definition test was used to measure level of 
mastery of the linguistic terms and a traditional logical test to measure logical task performance. The tests 
were presented to subjects in the age interval 6-22 years. The results showed a close relation between ability 
to define and and or and pattern of performance on the logical tasks. This was taken to indicate that level of 
mastery of the words and and or was interrelated with ability to solve logical tasks. 


Recently a number of studies have been made of the development of children’s mastery of and 
and or (e.g. Neimark & Slotnick, 1970; Suppes & Feldman, 1971; Paris, 1973). But it has not 
always been made clear whether the purpose has been to study the words and and or, or the 
logical units and and or. The words seem to differ in fundamental respects from the logical units, 
and this difference seems to be of great relevance for research and theory in this area. 

In a linguistic analysis, Dik (1968) has brought forward strong arguments for such a distinction. 
The words and and or belong to the grammatical class of coordinators of which a main feature 
is that they combine two or more sentence parts which are equivalent as to grammatical 
functioning (see also Dik, 1968, ch. 4). The word and is a combinatory term, used to coordinate 
two or more sentence parts, and or an alternative term indicating a choice between the members 
of the coordination. Thus, the two words have semantic meanings, but these meanings are of 
low specificity, making it possible to use the words in many different constructions. This seems 
to be the case particularly for the word and; this word can be used in number-names, such as 
two hundred and five, and also to express temporal, causal, and logical relationships (Johnson- 
Laird, 1968). Thus, the unspecific combinatory meaning of and makes it possible to express 
many different relations with this word. 

The meaning of the logical units and and or, in contrast, is very specific, and these units can 
be used only in certain types of sentences. The logical and is used to combine two sentences and 
means that ‘both sentences of which the conjunction is formed are true’ (Tarski, 1965, p. 20). 
The logical or can be interpreted either as an inclusive or as an exclusive disjunction. ‘The 
[inclusive] disjunction of two sentences merely expresses that at least one of these sentences is 
true. ... the [exclusive] disjunction of two sentences asserts that one of the sentences is true and 
the other is false’ (Tarski, 1965, p. 21). Thus, the logical and and or can be used only between 
assertive sentences to which truth values can be assigned, and the context of the sentences is 
not allowed to affect the meaning of the logical expression. No such restrictions exist for the 
coordinators and and or in natural language. These words can be used to ‘combine questions, 
wishes, exhortations, and almost any kind of linguistic expression to which no amount of 
ingenuity can assign any truth value at all’ (Dik, 1968, p. 268), and contextual factors 
always contribute in determining the specific relation expressed by the coordinator used in 
the expression. 

Experimental results give some support for this distinction. Johansson & Sjölin (1975) have 
shown that the words and and or are both mastered at about the age of four, whereas studies of 
the logical and and or have shown that these units are not mastered until high school age, and 
that or is more difficult than and (e.g. Neimark & Slotnick, 1970). Thus, the words seem to be 
mastered earlier than the logical units, but the results give no clue as to the interrelation between 
speech and logic in development. The purpose in the present experiment was to gather data 
relevant for an analysis of the relation between speech and logic in this area of cognitive 
development. Three different tests were designed and used in the experiment: one test of word 
usage, one logical test, and one test of word understanding. The results from the test of word 
usage were found to be unrelated to the results on the other two tests; therefore, only the latter 
two tests are presented in the present report. 


Method 
Experimental design 


A within-subjects design was used to study how performance on the tests varied as a function of age. The 
logical test was always presented before the test of word understanding; this test order had to be used to 
avoid carry-over effects from the linguistic test to the logical test. The age intervals tested were 6, 7, 8, 9, 
10, 12, and 22. The ages 7 and 9 were added to the experiment to gather further data about the performance 
drop observed with the ten year old children. 


Subjects 


Ten subjects were tested in each age interval. The six year olds attended a kindergarten, the 7-12 year olds 
attended school, and the 22 year olds were college students. All children are best characterized as having a 
middle class background. 


General procedure 


Two experimenters cooperated during the testing. One experimenter interacted with the child and the other 
recorded the child's utterances and responses. The wording of the tasks in each test was tested in pilot 
studies and modified until the instructions were easily understood by the children. No feedback as to the 
correctness of the response was given. The order of the tasks within each test was rotated to counterbalance 
within-test learning. The subjects were tested individually in separate rooms in the local nurseries and 
schools. 


The definition test 


The purpose in setting this task was to study the development of understanding of the words and and or. 
The instruction was ‘Imagine that you are meeting a child from another country who does not completely 
understand your language. You are playing and talking with him. Now and then you say some word which 
the child does not understand, and you have to explain it. For example, if you are talking about horses, and 
the child asks you what the word “horse” means, what would you answer?’ 

When the child had supplied an acceptable definition of ‘horse’, he was asked to define and and or. If the 
child produced only an example, he was encouraged to explain why this example tells something about the 
meaning of the coordinator in question. Incomplete explanations were also followed up by questions to 
ascertain that the subject could adequately explain the meaning of the term. 

Vygotsky's (1962) theory of concept development can be used to predict how the children's definitions 
should change as a function of age. Vygotsky found that small children defined the meaning of concepts in 
terms of concrete examples, whereas older children were able to define the meaning of one concept in terms 
of other concepts. Vygotsky considered the first type of definition to indicate spontaneous mastery; the 
concept referred directly to practical experience and was not integrated with other concepts. The second 
kind of definition indicates conscious mastery; now the children are able to isolate the meaning of the 
concept from practical, concrete experiences, and can define the meaning of the concept without recourse to 
examples. On the basis of this hypothesis it may be predicted that small children should define the meaning 
of and and or in terms of examples, which can be taken to indicate spontaneous mastery. With increasing 
age an increasing number of children should be able to explain the meaning of and and or with no use of 
examples. Type of definition, then, can be used to indicate level of mastery. 


The logical test 


The purpose of this test was to trace changes in performance on a typical logical test as a function of age. 
Therefore, the present test was modelled after Neimark & Slotnick (1970) and Suppes & Feldman (1971). 
The material used was stimulus sheets with eight figures varying in shape and colour (see example in 
Table 1). The shapes were square, triangle, and circle and the colours were yellow, blue, and red. Two 
shapes and two colours were used on each stimulus sheet, but with different attributes on each so that all 
shape/colour combinations were balanced over commands. Two instances of each figure had to be used 
on the sheets to allow the use of plural verb and noun forms in all commands. 


Table 1. Example of a stimulus figure together with logical and linguistic interpretations of the 
four different commands (the wording of the commands is given in the text below) 

Command     Logical analysis          Linguistic analysis 
1. and      Intersection              Intersection or inclusive disjunction 
2. and      Inclusive disjunction     Inclusive disjunction 
3. or       Inclusive disjunction     Inclusive or exclusive disjunction 
4. or       Exclusive disjunction     Exclusive disjunction 

[Stimulus figure and pictured correct responses not reproduced.] 

Note. As a mnemonic to differentiate the commands the relation between word length of each command 
and correct response is suggested. Commands 1 and 3, the ‘short’ ones, are linguistically ambiguous but not 
commands 2 and 4, the ‘long’ ones. Correct logical and linguistic performance differ in the case of the 
‘short’ commands but not in the case of the ‘long’ commands. 

When using ordinary language to define different sets, the idiom used has been found to affect 
performance (Suppes & Feldman, 1971). To obtain some control over this factor, it was decided to use two 
different wordings for each of and and or. They were: 

1. Encircle all figures that are blue and square. 

2. Encircle all figures that are blue, and all that are square. 

3. Encircle all figures that are blue or square. 

4. Encircle all figures that are blue, or all that are square. 

Each command was repeated twice, which means that each subject received eight different commands. 

The interpretation of these commands is dependent upon point of view when defining the meaning of and 
and or. Suppes & Feldman (1971) have interpreted commands similar to the present ones on the basis of a 
classification of and and or as logical connectives. According to their analysis command 1 is an intersective 
command, commands 2 and 3 inclusive disjunctive commands, and command 4 an exclusive disjunctive 
command. Correct responses, then, on the various commands should be: for command 1 the encircling of the 
two blue and square figures; for commands 2 and 3 the encircling of the figures with at least one of the 
attributes blue and square; and for command 4 the encircling of either all square figures or all blue figures. 

A critical difference between a logical and a linguistic interpretation is the weight given to factors outside 
the command. In logic such external factors are excluded by definition, whereas in ordinary language the 
interpretation of a given utterance is a joint function of the linguistic information in the utterance and the 
situation in which it is used (Dik, 1968). In the present case the situation and the linguistic information do 
not force an unambiguous interpretation of all of the commands. Take, for example, command 3: this 
command may either be interpreted to mean that all figures with at least one of the attributes should be 
encircled (a response identical with inclusive disjunction), or that only figures with one of the attributes 
should be encircled (a response identical with exclusive disjunction). The first interpretation can be obtained 
if ‘blue or square’ is treated together, whereas the second interpretation can be obtained if a pause is 
inserted between ‘blue’ and ‘or square’. In command 4 ‘blue’ and ‘or square’ are more separated than in 
command 3, which facilitates the latter of the two interpretations. A similar interpretational ambiguity can be 
detected in command 1. This command may be interpreted to mean the encircling of the two blue and square 
figures (identical with intersection), or that all figures with at least one attribute should be encircled. 
Command 2 favours the latter interpretation. Table 1 summarizes and illustrates the above analyses. 
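
The two analyses can be made concrete with set operations. The sketch below assumes a hypothetical stimulus sheet with two colours (blue, red) and two shapes (square, triangle), two instances of each figure, and treats an "exclusive" response as encircling either all blue figures or all square figures, as in the analysis above. The names `LOGICAL`, `LINGUISTIC` and `agrees` are illustrative.

```python
# Hypothetical sheet: 2 colours x 2 shapes x 2 instances = 8 figures.
FIGURES = [(colour, shape, copy)
           for colour in ("blue", "red")
           for shape in ("square", "triangle")
           for copy in (1, 2)]

blue = {f for f in FIGURES if f[0] == "blue"}
square = {f for f in FIGURES if f[1] == "square"}

intersective = blue & square      # only figures that are both blue and square
inclusive = blue | square         # figures with at least one attribute

# Admissible responses per command under each analysis.
LOGICAL = {1: [intersective], 2: [inclusive], 3: [inclusive], 4: [blue, square]}
LINGUISTIC = {1: [intersective, inclusive], 2: [inclusive],
              3: [inclusive, blue, square], 4: [blue, square]}


def agrees(command: int, response: set, analysis: dict) -> bool:
    """True if the encircled set is an admissible response under the analysis."""
    return response in analysis[command]
```

Under these assumptions the two analyses differ only for commands 1 and 3: an inclusive response to command 1, or an exclusive response to command 3, is admissible linguistically but not logically.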

It may be predicted that if the development of logic is independent of speech, the commands should be 
interpreted as specified by Suppes & Feldman (1971) and the increase in logical performance should be 
independent of level of mastery of the words and and or. If the development of speech and logic is 
interrelated, responses to the commands should be found to follow the linguistic analysis and 
performance on the logical test should be dependent on level of mastery of the words and and or. 


Results 
The definition test 


The data presented include the answers obtained from the follow-up questions about incomplete 
examples and explanations. The following response categories were used: cannot, the subject 
refused to define the word or said that he could not; example responses, the subject provided an 
example but did not explain it (e.g. ‘and means you and me’); example with explanation 
responses, the subject provided an example followed by an explanation (e.g. ‘If someone asks 
you if you want an apple or a banana, or means that you can take only one of them’); and 
explanation responses, the subject provided an explanation of the meaning of the word not based 
on an example (e.g. ‘And is used to bring together things that have something in common’). A 
few subjects provided explanations of or that included both the combinatory and the alternative 
meanings. This was the case for seven out of 38 subjects in the example with explanation 
category and one out of 25 in the explanation category. To facilitate presentation of the data 
these subgroups were not separated in the figure. The results are given in Fig. 1. The main result 
seems to be the relation between the example with explanation and the explanation responses. 
The former response category was replaced by the latter around the age of ten. On the basis of 
Vygotsky’s interpretation of explanation responses, this may be taken to indicate that after the 
age of ten most children master and and or on a conscious level. Since conscious mastery means 
increased intellectual flexibility (Vygotsky, 1962) it may be predicted, on the basis of a 
hypothesis of a relation between speech and logic, that the children exhibiting conscious mastery 
on the definition test should solve the tasks in the logical test in line with the linguistic analysis 
given above. 

Figure 1. The definition test. Number of responses in each category as a function of age for and and or. 
[Panels: And; Or. Categories for and: Cannot; Example only; Example with combinatory explanation; 
Combinatory explanation. Categories for or: Cannot; Example only; Example with alternative, or combinatory 
and alternative, explanation; Alternative, or combinatory and alternative, explanation.] 

It may be observed that performance on the two words developed in parallel. This is a 
replication of the results on the development of spontaneous mastery of and and or reported by 
Johansson & Sjölin (1975), and contrasts with studies of the development of logical connectives, 
which usually show or to be much more difficult than and (e.g. Neimark & Slotnick, 1970). 

An examination of the content of the definitions showed that and was always defined as a 
combinatory term, e.g. ‘Is to be used when two things have something in common - are brought 
together’. Or was usually defined as an alternative term, e.g. ‘One is to choose between two 
alternatives’. Occasionally both the combinatory and the alternative meanings were stated, e.g. 
‘Of two things, one or both of them can be carried out’. These results are taken to indicate that 
the linguistic and can be considered a combinatory term and the linguistic or an alternative term 
(Dik, 1968). 


The logical test 


The following response categories were defined: intersective responses, only the two figures with 
both attributes were encircled; inclusive responses, all figures with one or both attributes were 
encircled; exclusive responses, either all figures with the one attribute or all figures with the other 
attribute were encircled; and other responses, the few remaining responses that did not fall into 
any of the above categories. 

A preliminary analysis showed that some subjects repeated the same response more or less 
throughout the test. Such repetitious behaviour makes the results for these subjects qualitatively 
different from the results for the subjects who varied their responses with the content of the 
command. Therefore, the results for the repeaters are presented in a graph separate from the 
results for the rest of the subjects. Each subject repeating the same response six times or more 
in a row was considered a repeater. The data are presented in Fig. 2. 
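
The repeater criterion amounts to finding the longest run of identical responses across a subject's eight commands. A minimal sketch, assuming each response has already been coded into one of the categories above; the short category labels and function names are illustrative.

```python
from itertools import groupby


def longest_run(responses: list[str]) -> int:
    """Length of the longest run of identical consecutive responses."""
    return max((len(list(group)) for _, group in groupby(responses)), default=0)


def is_repeater(responses: list[str], criterion: int = 6) -> bool:
    """A subject is a repeater if the same response occurs `criterion` times in a row."""
    return longest_run(responses) >= criterion


repeater = ["excl"] * 6 + ["incl", "inters"]
varied = ["inters", "incl", "excl", "excl"] * 2
```

Here `is_repeater(repeater)` is true and `is_repeater(varied)` is false, since the longest run in the second sequence is two.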

First, the results for the repeaters were analysed. The number of repeaters was very high in 
the youngest age group but dropped to zero with the nine-year olds. With the ten year olds there 
was again an increase in number of repeaters followed by a renewed decrease. The graph also 
shows that the young subjects tended to repeat the exclusive and the intersective responses, 
whereas six out of seven of the older subjects repeated the inclusive response. 

Next, the results for the non-repeating subjects were analysed. For command 1, the 
intersective response was the most frequent response followed by the inclusive response. 
According to the linguistic analysis both these responses should be obtained, whereas the logical 
analysis predicted only the intersective response. For command 2, the inclusive response 
dominated throughout all ages. This agrees with both the logical and the linguistic analysis. 

For command 3, the exclusive response was the most frequent one for young subjects, with 
an increasing frequency of inclusive responding for older subjects. The high frequency of 
exclusive responses seems difficult to explain on the basis of the logical analysis. For command 
4, the exclusive response dominated at all ages, and this result agrees with both the logical 
and the linguistic analysis. 

Figure 2. The logical test. The upper graph shows number of repeaters in each age interval and responses 
repeated. The remaining four graphs show responding on each command as a function of age for the rest of 
the subjects; the non-repeaters. 

Thus, all main results agree with the linguistic analysis, whereas the results for commands 1 
and 3 seem difficult to explain adequately from a logical point of view. In addition, around the 
age of ten a dramatic performance change was observed for both the repeaters and the 
non-repeaters. This seems to parallel the performance shift observed with the definition test. 
This suggests a relationship between performance on the two tests. A series of intra-individual 
analyses were run to find out if such a relationship existed. 


Table 2. Intra-individual relation between usage of explanation response on definition test and 
repeating/non-repeating responses on logical test for young (6-9) and old (10-12) subjects 

                                  Responses on logical test 

                         Young subjects             Old subjects 
Responses on          Repeat     Non-repeat     Repeat     Non-repeat 
definition test       (n = 17)   (n = 23)       (n = 7)    (n = 13) 
Explanation, and         2           6             2           9 
Explanation, or          0           3             2           9 
Remaining, and          15          17             5           4 
Remaining, or           17          20             5           4 


Table 3. Intra-individual relation between usage of explanation response on definition test and
performance on commands 1 and 3 for young (6-9) and old (10-12) subjects. Only subjects not
repeating responses on the logical test are included

                                         Responses on logical test
                               Young subjects                       Old subjects
Responses on
definition test       Inters.  Excl.  Incl.  Other   n     Inters.  Excl.  Incl.  Other   n
Command 1
  Explanation, and       7       0      5      0     6       13       0      3      0     8
  Explanation, or        4       0      2      0     3       17       0      1      0     9
  Remaining, and        12       5     12      1    17        9       0      1      0     5
  Remaining, or         18       5     16      1    20        5       0      3      0     4
Command 3
  Explanation, and       4      10      2      0     8        0      11      5      0     8
  Explanation, or        0       4      2      0     3        0      10      6      2     9
  Remaining, and         4      22      4      0    15        2       3      3      2     5
  Remaining, or          8      28      4      0    20        2       4      2      0     4


First, the relation between response used on the definition test and tendency to repeat was 
analysed. The 22 year old subjects were excluded since they used only explanation responses 
and showed no tendency of repetitious behaviour. The relevant data are given in Table 2. The 
results show that those using an explanation response to define and and or exhibited very little 
repetitious behaviour on the logical test, whereas those using some other response to define and 
and or exhibited about as much repetitious as non-repetitious responding. The tendencies in the 
two age groups were very similar. Statistical tests on data summed over ages showed this 
intra-individual correlation to be significant in the case of or, χ²(1) = 4.12, P < 0.05, but not in
the case of and, χ²(1) = 3.09, P > 0.05. This analysis, then, indicates that a low degree of
mastery of particularly or, as measured by the definition test, was strongly related to repetitious 
responding on the logical test. 

Next the intra-individual relation between usage of explanation responses and performance on 
commands 1 and 3 was analysed. The results on the other two commands were disregarded since 
they did not differentiate between the logical and the linguistic analysis. This analysis was
carried out only for the subjects not repeating responses on the logical test. The college students 
were excluded since they provided only explanation responses on the definition test. The data 
are given in Table 3. The results for command 1 show that the subjects using explanation 
responses concentrated on the two responses acceptable from a linguistic point of view; the 
intersective and the inclusive, whereas the young subjects defining and and or by other means 
also produced exclusive responses. A similar tendency was obtained also for command 3; the 
subjects using explanation responses to define and produced only four unacceptable responses, 
and those defining or by explanations only two unacceptable responses, versus eight and ten, 
respectively, for the subjects using some other response to define the two words. The result for 
or was significant (P < 0.05, sign test) but not the result for and.


Discussion 


The main result seems to be the quite strong correlation between level of mastery of the word 
or, but also and, and quality of performance on the logical test. Many of the subjects defining 
and and or with recourse to examples tended to repeat the same response throughout the logical 
test or produced responses unacceptable from a linguistic point of view on commands 1 and 3. 
The subjects using explanations without recourse to examples to define and and or, exhibited a 
much higher level of performance on the logical tasks. These results, then, indicate that 
performance on the logical test is closely tied to level of mastery of and, and, in particular, of 
or. 

An interesting finding in both the present and the former experiment (Johansson & Sjölin,
1975) was that the words and and or were not found to differ in degree of difficulty. This 
contrasts with results from experiments on and and or as logical units. Such experiments show 
conjunctions to be much easier than disjunctions (e.g. Neimark & Slotnick, 1971; Paris, 1973). 
The present findings, however, suggest that the discrepancy between these two series of results 
may be due to the use of the meaning of the words and and or when solving the logical tasks, 
and that the meaning of and, but not the meaning of or, facilitates logical test performance. This 
suggestion can be analysed with the help of the present logical test results. First, consider the 
two and commands. According to Suppes & Feldman's (1971) logical analysis, command 1 is an 
intersective command and command 2 an inclusive disjunctive command. The results show that 
performance, from a logical point of view, increased at about the same rate for both commands. 
Thus, the fact that and had to be given quite different interpretations in the two commands had 
very small effects on level of performance, logically defined. This seems to support Dik's (1968) 
claim that the unspecific combinatory meaning of the word and allows the use of this word to 
express many different types of combinations. 

For the two or commands, the results were different. According to the logical analysis, 
command 3 is an inclusive disjunctive command and command 4 an exclusive disjunctive 
command. From this point of view command 3 must be considered much more difficult than 
command 4, since the number of inclusive responses was very small for subjects younger than
ten on command 3, whereas exclusive responding was common on command 4 in most age 
intervals. What, then, is the cause of the ‘difficulty’ of command 3? The answer appears 
straightforward: or was interpreted as an alternative word by young subjects, therefore they 
gave exclusive responses to command 3. This means that or may have a very restricted 
alternative meaning for young subjects and that these subjects have great difficulties in detecting 
that or commands may be given also an inclusive interpretation. The results also show that the 
subjects older than ten started to give inclusive responses to command 3. The reason for this 
change may be the increase in level of mastery of the word or. To find out if this was the case 
an analysis was made on the intra-individual relation between tendency to produce explanation 
responses to or on the definition test and the use of inclusive responding on command 3 in the 
logical test. This analysis showed that the subjects using explanations to define or produced
significantly more inclusive responses, χ²(1) = 8.32, P < 0.01, than the subjects defining or by
other means. Thus, the increase in use of inclusive responding was closely related to the 
increase in level of mastery of the word or. 

This indicates that the mastery of or on a spontaneous level may make the solution of 
disjunctive tasks very difficult, whereas no equivalent difficulties may exist in the case of the 
word and and conjunctive tasks. Thus, it is suggested that one reason for the obtained difference 
in degree of difficulty between conjunctive and disjunctive tasks is to be found in this difference 
in transfer from speech to logic. 

This line of reasoning throws some doubts on the use of logical tasks for the study of 
differences in intellectual development between deaf and hearing subjects (e.g. Furth, 1971). The 
reasoning behind these experiments seems to have been that speech should either facilitate 
intellectual development or have no effect on it. The findings obtained by Furth have been 
considered as support for the second alternative; that intellectual development is independent of 
speech. But the present results suggest a third alternative; intellectual development is related to 
speech development, but in the case of logical tasks the transfer from speech to logic may be 
negative in the case of disjunctions, particularly if or is mastered on a spontaneous level. 

Thus, the present results seem to give strong support for the view that speech and thought 
are interrelated. Such a view has been brought forward by Vygotsky (1962). He considers 
development of conscious mastery of concepts a crucial step in the development of intellectual 
operations. The present finding that quality of performance on the logical test was closely related 
to the ability to define the words and and or with explanations is taken to agree with
Vygotsky’s (1962) theory. 

This interpretation contrasts with Piaget’s view of the role of language in intellectual 
development. According to Piaget (e.g. 1954) development of logical thought is dependent on 
operational thought, and operational thought originates from actions. These operations develop 
independently of language. The role of language is to facilitate the use of the operations 
and to enable interpersonal communication. According to this line of reasoning the development 
of the ability to solve logical tasks should not be dependent on the ability to define the meaning 
of the words and and or. The contrary findings in the present report indicate that Piaget 
may have underestimated the role of speech in intellectual development. But this is not a 
denial of the role of actions as a main factor in intellectual development; it is only suggested
that speech and thought may be more closely related than envisaged by Piaget. 
A technique for converting ranks into measures


Robert Wood and Douglas T. Wilson 





A method for converting ranks into measures is proposed. It should be useful whenever it is necessary to 
amalgamate ranked data with interval data. Being based on the ranking of differences between subjects, the 
method is able to utilize whatever information a judge is able to provide. Suggestions are made for dealing 
with data which are oddly distributed. 





In an article in this journal Cook & Smith (1974) made out a case for preferring ranking to 
rating. The virtue of ranking, they felt, and we agree with them, is that it eradicates differences 
in level and spread which in an analysis of ratings are simply nuisance factors. They did, 
however, mention two shortcomings of ranking. The first, that judges may be forced to 
differentiate, can be got over by allowing ties. The second, and more serious, is that ordinarily
ranking takes no account of the size of the differences between different pairs of subjects. The 
purpose of this article is to describe a method of allocating measures to ranks - a problem which 
Cook & Smith did not consider explicitly - which does take account of size of differences, 
indeed depends on estimates of these differences being available. We feel the method has most 
utility with small groups of subjects where such information is likely to be available. An 
application involving teacher assessments of students' achievement is given. 


Method 


The method to be described is due to Kendall (1962). The idea is very simple. Suppose that not 
only can judges rank subjects but that they can also rank differences between subjects and 
beyond that differences of differences, etc., then any task involving n subjects can ultimately be 
reduced to a single pair of differences which can be ranked 1, 2 or 2, 1. In practice, it is doubtful 
whether judges would be able to rank beyond first-order differences but this does not mean that 
the method is thereby rendered useless. It will still work perfectly well, indeed we will illustrate 
how it works in this very common situation. 

The statistical rationale of the method depends on a result developed by Daniels (1962). If a
magnitude is broken randomly into n parts the expectation of the parts in descending order is

\[
\frac{1}{n}\left(1 + \frac{1}{2} + \cdots + \frac{1}{n}\right),\quad
\frac{1}{n}\left(\frac{1}{2} + \cdots + \frac{1}{n}\right),\quad \ldots,\quad
\frac{1}{n}\cdot\frac{1}{n}.
\]

It follows that if differences could be reduced down to the last pair, which means two parts, the
values assigned would be ½(1 + ½) or ¾, and ½ · ½ or ¼. Dropping the denominator for
convenience - the unit of measurement is arbitrary, a point we shall come back to - the starting
values from which rank measures are to be constructed turn out to be 3 and 1. The following
extended illustration should clarify the method further.

Suppose there are five subjects to be ranked. Then, applying the above formula, the
expectations of the four ranked first-order differences will be


1/16, 7/48, 13/48, 25/48 or
3, 7, 13, 25.


Leaving the question of metric until last, we now wish to construct values for the five subjects, 
given the magnitude of the differences between them. The total of the four numbers is 48 and 
their average is 12. The numbers 


12, 12+3, 12+10, 12+23, 12+48 or
12, 15, 22, 35, 60,


therefore meet our requirements. 
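
The construction just described can be sketched in a few lines of Python. This is a minimal illustration only, not code from Kendall or from the present authors: the function names are hypothetical, and the argument `gap_ranks` (the judge's ranking of the first-order differences, listed from the lowest pair of subjects upwards, with 1 denoting the largest difference) is an assumed way of encoding the judge's information. Starting the values at the average weight follows the worked example; the starting point is arbitrary until the values are placed on a metric.

```python
from fractions import Fraction
from math import lcm  # requires Python 3.9 or later


def expected_parts(n):
    """Expected sizes of n random parts of a unit magnitude, largest first."""
    return [sum(Fraction(1, j) for j in range(k, n + 1)) / n
            for k in range(1, n + 1)]


def integer_weights(n):
    """The same expectations with the common denominator dropped, largest first."""
    parts = expected_parts(n)
    common = lcm(*(p.denominator for p in parts))
    return [int(p * common) for p in parts]


def construct_values(gap_ranks):
    """Unscaled values for the subjects, lowest subject first.

    gap_ranks[i] is the rank (1 = largest) of the difference between
    subject i and subject i + 1 in the ordered list of subjects.
    """
    weights = integer_weights(len(gap_ranks))
    gaps = [weights[rank - 1] for rank in gap_ranks]
    values = [Fraction(sum(weights), len(weights))]  # start at the average weight
    for gap in gaps:
        values.append(values[-1] + gap)
    return values


# Five subjects whose differences grow steadily from the bottom of the ranking to the top.
values = construct_values([4, 3, 2, 1])
print([int(v) for v in values])
```

For the five-subject illustration the weights are 25, 13, 7 and 3 and the constructed values are 12, 15, 22, 35 and 60, as in the text. A different ranking of the differences simply reallocates the same four weights to different gaps.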

All that remains is to place the values on a metric. This requires the existence of some related 
scale on which the subjects concerned have already been placed, or else, as Kendall suggests (p. 
137), the presence in the group of subjects who have already been placed on the scale and who 
can act as markers. Being obliged to calibrate by some other scale, however related, might be 
thought a weakness of the method and to some extent it is, but there is a limit to what 
bootstrapping can do and any method which seeks to convert ranks to usable measures must 
inevitably rely on other scales. Besides, the only equating which is done is to tie the extremes 
of the chosen scale. Insofar as the range of the distribution of measures of behavioural traits in 
any group might be assumed to be much the same, this does not seem an overly rash step to 
take, although empirical verification will be necessary. 

Let us imagine, for the sake of argument, that the five subjects have been measured on some 
other scale and that the measurements are 


9, 17, 37, 54, 68. 


To fit the concocted values to this scale they must be adjusted so that they are distributed across 
the range 9-68 without disturbing their proportional relationships. This involves multiplying the 
differences by the fraction 59/48, this being the ratio of the two ranges. After some arithmetic, it 
will be seen that the scaled values we seek are (approximately) 


9, 13, 22, 37, 68. 
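
The arithmetic of this calibration step can be checked with a short sketch. The function name `pin_to_scale` is hypothetical; the only operation assumed is the one described above, namely tying the extremes of the constructed values to the extremes of the related scale and stretching the differences in proportion.

```python
def pin_to_scale(values, low, high):
    """Rescale values linearly so that min(values) -> low and max(values) -> high."""
    v_low, v_high = min(values), max(values)
    return [low + (v - v_low) * (high - low) / (v_high - v_low) for v in values]


constructed = [12, 15, 22, 35, 60]       # values built from the ranked differences
scaled = pin_to_scale(constructed, 9, 68)  # extremes of the related scale
print([round(s, 1) for s in scaled])
```

Run on the illustration, the differences 3, 10, 23 and 48 are each multiplied by 59/48, giving values close to those quoted. The third value works out at about 21.3, marginally below the approximate figure of 22 given in the text.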


Should it be possible to rank differences beyond the first order, the method of constructing 
values proceeds exactly as above, except that there are more steps. Kendall provides a more 
extended example. 


Validation


A test of Kendall’s method is how well it recovers mark values already assigned to subjects. 
Some data of this kind will be presented next. We take the opportunity to compare two 
competing methods of converting ranks into measures, what we call the rank swap method and 
the normal scores method. 

Assuming again the existence of a related scale on which subjects have been measured, the 
rank swap method proceeds by transferring the highest scale value to the subject ranked first, 
the second highest value to the second ranked subject, and so forth until all the scale values 
have been mapped onto the ranking. Thus it goes a stage further than the Kendall method in that 
it uses not just the extreme related scale values but the entire distribution. In effect, the assigned
values function as order statistics from an unknown empirical distribution defined by the related 
scale measures. The rank swap method is convenient, numerically, because the same set of 
values is preserved so that the descriptive statistics automatically are equated but it suffers from 
the drawback that a subject has someone else's score - from a different, even though related, 
test or judgement - imposed on him. Where the related scale is, shall we say, an external 
examination and the ranking is by the teacher, there is the danger that an unexpectedly poor 
examination mark obtained by candidate X will be transferred to candidate Y, who may then 
consider himself victimized. 
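
A minimal sketch of the rank swap method may make the drawback concrete. The function name, candidate labels and marks below are all hypothetical and chosen purely for illustration.

```python
def rank_swap(ranking, related_marks):
    """Assign the related-scale marks to subjects in rank order.

    ranking: subject labels, first = ranked highest by the judge.
    related_marks: marks on the related scale, in any order.
    """
    ordered = sorted(related_marks, reverse=True)
    return dict(zip(ranking, ordered))


# Hypothetical examination marks and a teacher's ranking that disagrees with them.
exam_marks = {"A": 70, "B": 55, "C": 40}
teacher_ranking = ["C", "A", "B"]

assigned = rank_swap(teacher_ranking, exam_marks.values())
print(assigned)
```

Here candidate C, ranked first by the teacher, receives the mark of 70 earned by A, while A and B each inherit a lower mark obtained by somebody else. The set of marks, and hence its mean and spread, is unchanged.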

As an alternative to using empirical order statistics, it is possible to replace ranks by order 
statistics from a known distribution like the normal or perhaps one of the gamma distributions. 
Normal order statistics, which are expected values of random samples of size n drawn from a 
normal distribution, are tabulated by David et al. (1968). If one were confident that judges 
expected subjects to fall into a more or less normal distribution then one might be quite ready to 
use normal scores routinely although with small groups such confidence might be misplaced. At 
any rate, this is a matter for empirical verification and the samples to be presented next will give 
some idea as to how well normal scores work in practice. It will be appreciated that, like every 
other technique, normal scores require the existence of a related scale, against which they can 
be calibrated. If this were not done all those ranked in the same position in groups of the same 
size would receive identical values, regardless of differences in level of the trait being measured. 
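
Where tables of normal order statistics are not to hand, approximate normal scores can be computed directly. The sketch below does not reproduce the tabulated values of David et al.; it substitutes Blom's well-known approximation, in which the expected value of the i-th smallest of n standard normal observations is taken as the normal deviate corresponding to the proportion (i - 0.375)/(n + 0.25). The function names are hypothetical, and calibration is done, as for the Kendall method, by tying the extremes to those of a related scale.

```python
from statistics import NormalDist


def approximate_normal_scores(n):
    """Approximate expected normal order statistics, lowest first (Blom's formula)."""
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf((i - 0.375) / (n + 0.25))
            for i in range(1, n + 1)]


def calibrate(scores, low, high):
    """Rescale scores linearly so that the extremes match low and high."""
    s_low, s_high = min(scores), max(scores)
    return [low + (s - s_low) * (high - low) / (s_high - s_low) for s in scores]


scores = approximate_normal_scores(5)
calibrated = calibrate(scores, 9, 68)  # extremes taken from the earlier illustration
print([round(c, 1) for c in calibrated])
```

Because normal scores are symmetric about zero, the calibrated values are symmetric about the midpoint of the related scale, whatever the judge may feel about the relative sizes of the differences.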


Analysis 

The original data comprised assessments by teachers, in mark form (scale 0—100), of their 
students' competence in practical Advanced level Chemistry work. From the marks rank orders 
of students and differences between students were extracted. Although ranking of differences 
could have been continued until only two values were left, in the interests of realism ranking 
was discontinued after the first-order differences. For the purposes of calibration, external 
examination marks were available. 

Six small data sets (n < 10) were singled out for special attention. Figure 1 shows the results of
plotting teachers’ marks against (a) values reconstructed by Kendall’s method; (b) values derived 
from the rank swap method; and (c) normal scores suitably scaled. The test is how well the fitted 
values recover the teachers' marks. 

An overall scrutiny of the six plots shows that the Kendall method gives the best fit, followed 
by the normal scores with rank swap a poor third. Evidently the weakness of rank swap is that it 
fails to reflect distinctions judges wish to make among subjects. Wherever there are ties in 
related scale values (examination marks) subjects are forced to be tied even though the judge 
(teacher) wishes to separate them. The resulting lack of fit is very apparent in plot (c) but is also 
visible in other plots. 

The fact that normal scores compare quite favourably with the Kendall method suggests that 
judges may well behave as if the traits they are ranking are normally distributed. This is true 
even in plot (a) where n = 3. If judges are unable to rank first differences or are unable to supply
any information beyond a plain ranking, it would seem reasonable to substitute normal scores for 
ranks. 

That all three methods will sometimes concur is shown by plot (d). Our experience is that as n 
grows larger, the agreement between the methods continues to improve. For n> 15 we would 
think one method is as good as another. However, since we are interested above all in a small 
groups technique, and in making use of the extra information about small groups which judges 
may be expected to convey, this result is of academic interest only. 

Although the Kendall method gives a good fit, it might be asked whether in certain cases, e.g.
plot (e), the fit could not be improved. One possibility, suggested by Kendall, is to ‘pin’ at inner
values of the related variable rather than at the extreme values, the rationale being that where
there are outliers or stragglers, the measured range gives a distorted estimate of the ‘true’ range.
Ignoring one or both of the extreme values (‘trimming’ as it is known in the statistical
literature) and pinning the constructed values at the new extremes may lead to a better fit
between constructed and criterion values.

Figure 1. Plots of constructed scores against criterion scores for three methods of converting ranks into
measures. x, Kendall metric; 6, rank swap; O, normal scores.

Of the plots shown in Fig. 1, it is plot (e) which is most in need of trimming. The data
consist of a front runner, three widely spaced out stragglers and a block of values in the middle. 
Figure 2. Effect on Fig. 1, plot (e) of pinning at the top and second bottom points. (Upper panel: pinning at
end points. Axis label: teacher score.)


The top plot in Fig. 2 is a repeat of plot (e) while the bottom plot shows the effect of ignoring 
the lowest criterion value and pinning on the highest and second lowest values. Evidently the 
trimming strategy has worked, not so much by altering the position of the bottom values as by 
straightening out the middle block and augmenting it with one of the stragglers. No further 
improvement in fit was observed when the top value was also trimmed. The same was true when 
differences were reduced down to two values and values were constructed. It would appear that 
first-order differences will suffice quite well. 
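
Inner pinning can be sketched as a small change to the calibration step. The sketch rests on an assumption not spelled out in the text: that the linear transformation fixed by the two pinned positions is applied to every constructed value, so that a trimmed subject still receives a (linearly extrapolated) value. The function name and the related-scale marks are hypothetical.

```python
def pin_at(constructed, related, low_index=0, high_index=-1):
    """Rescale constructed values so that two chosen positions match the related scale.

    Both lists are sorted in ascending order first. low_index=0 and high_index=-1
    pin at the extremes; low_index=1 ignores the lowest related value.
    """
    c = sorted(constructed)
    r = sorted(related)
    c_low, c_high = c[low_index], c[high_index]
    r_low, r_high = r[low_index], r[high_index]
    return [r_low + (v - c_low) * (r_high - r_low) / (c_high - c_low) for v in c]


constructed = [12, 15, 22, 35, 60]
related = [2, 17, 37, 54, 68]  # hypothetical marks with one low straggler

at_extremes = pin_at(constructed, related)             # pinned at 2 and 68
trimmed = pin_at(constructed, related, low_index=1)    # pinned at 17 and 68
print([round(v, 1) for v in at_extremes])
print([round(v, 1) for v in trimmed])
```

With these hypothetical marks, pinning at the extremes spreads the constructed values over the whole range from 2 to 68, whereas pinning at the second lowest mark compresses them into the range occupied by the bulk of the group.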


Discussion 

The method presented offers a means of utilizing whatever information a judge is able to supply 
about subjects. Even if the judge can say no more than that one subject is head and shoulders 
above the rest, who are more or less equivalent except for one straggler, that information can be 
turned to advantage, producing a more refined scale than would result if the judge were forced 
to provide a rank order. For the method to work best, an adaptive strategy for securing the 
extremes of the constructed values is desirable. Where outliers at either end of the range are 
suspected, pinning at inner values of the related variable should be carried out. It may well be,
and empirical experience only will show this, that trimming and inner pinning should be carried 
out as a matter of course. With some data more radical action may be called for, as when the 
data split into two distinct and more or less equal sets, in which case it may be necessary to fit 
each set separately. 

Although not all the bugs have yet been ironed out, we believe this technique to be feasible, 
whether carried out with the aid of a pocket calculator or in an interactive computing mode. Of 
the questions which remain to be answered perhaps the most important one concerns the choice 
of a related scale. Work is now going on to ascertain what the effect is of substituting one scale 
for another. 
The influence of secondary depth cues on the understanding by Nigerian
schoolboys of spatial relationships in pictures 


J. R. Nicholson and G. M. Seddon 





The ability of 105 Nigerian schoolboys to interpret pictures three-dimensionally was studied as a function of 
changing the number of depth cues on moving from monochrome photographs incorporating the effects of 
shadow, through fully cued line drawings, to line drawings containing only elevation cues. The experiment 
also investigated the effects of different amounts of previous formal training in technical drawing on 
performance in these tasks. 

The results showed a significant increase in performance as the number of depth cues increased above that 
contained in the minimum cued line drawings. However, the difference between the effects of fully cued line 
drawings and the photographs was not significant. Neither the main nor interaction effects relating to levels 
of formal training in technical drawing were significant. 





In teaching scientific and technical subjects at the secondary or tertiary levels, pictorial 
representation is a major concern. It is therefore important to produce the most effective 
pictures for different types of student, and to develop an understanding of the factors within the 
student and the pictures which determine a student’s ability to understand pictures. 

The problem is of considerable significance in Africa where, according to the review by Miller 
(1973), a number of investigators have concluded that many people have difficulty in 
comprehending the portrayal of depth in pictures produced according to Westernized styles of 
representation. In particular it has been maintained that many Africans regard the spatial 
relationships depicted in these pictures as referring to two dimensions rather than three (e.g. 
Hudson, 1960, 1962 a, b; Mundy-Castle, 1966; Deregowski, 1968, 1969, 1971). 

Investigations into the effects of differences in pictures have concentrated on studying the 
effects of changing the number and nature of the depth cues incorporated into the pictures (e.g. 
Oh, 1968; Wilcox & Teghtsoonian, 1971; Jahoda & McGurk, 1974a, b; McGurk & Jahoda, 1974, 
1975; Olson, 1975). However, as pointed out by Jahoda & McGurk (1974b), the experiments by
Wilcox & Teghtsoonian (1971) and Oh (1968) have not manipulated the changes in pictures so as 
to provide meaningful answers to this question. The results of the various experiments of Jahoda 
& McGurk indicate that there are different effects for the ability to understand size relationships 
on the one hand and spatial relationships on the other. Thus increasing the number of cues 
improved the ability to understand size relationships in children over the age range 4-13 in all 
the different cultures studied (Jahoda & McGurk, 1974 b, c; McGurk & Jahoda, 1975). However, 
in the case of understanding spatial relationships the effects of increasing the number of cues 
had a deleterious effect on the performance of Scottish, Rhodesian and urban Hong Kong 
children (Jahoda & McGurk, 1974 a, b) but had no effect on the performance of Ghanaian 
(McGurk & Jahoda, 1975) or Hong Kong boat children (Jahoda & McGurk, 1974 c). Olson (1975) 
found that the nature of the cues which were introduced, as opposed to their number, was also 
significant with four and five year old American children. 

Rather surprisingly little relevant research seems to have been carried out with secondary or 
tertiary level students in Africa, and it is not clear how the findings of these earlier experiments 
relate to secondary and tertiary level students. It was therefore the aim of the present 
investigation to determine how the ability of African secondary students to understand pictures 
three-dimensionally changes, as the number of different types of depth cue increases in carefully 
defined stages. As a second aim the experiment was to investigate the existence of interactions 
involving the different types of picture and differences in the amount of formal training which 
people have had in understanding these pictures. 
Method

The basic plan was to devise a test of the ability to interpret pictures three-dimensionally and to compare the 
mean scores of groups of students who had been randomly allocated to take one of three different versions 
of the test, each differing from the others in the number of different types of secondary depth cues 
incorporated into the pictures. At the highest level of cueing the pictures were to be monochrome
photographs incorporating the effects of shadow (see Fig. 1a). At the next level the pictures were to take the
form of line drawings incorporating relative size, overlap and linear perspective cues, as well as elevation
cues relating to differences in the positions of objects in the picture (see Fig. 1b). At the lowest level the
pictures were to be line drawings containing elevation cues only (see Fig. 1c). Differences in the levels of
previous training were to be investigated by testing on the one hand, students who were taking technical
drawing as a major special course of study, and, on the other, students who were not.

Figure 1. The pictures containing different sets of secondary depth cues used in the three versions of the
test. Secondary depth cues present: (a) elevation, line convergence, overlap, relative size, texture, shadow;
(b) elevation, line convergence, overlap, relative size; (c) elevation.


Subjects 


The subjects were 105 boys from the fourth year of a secondary school in Lagos, Nigeria. Their ages ranged 
from 13 to 20 years (mean = 16.0, S.D. = 1.2). Of these, 45 had been studying technical drawing since
entering the school, whereas the remaining 60 had not. The choice of subjects had been determined by the 
boys’ preferences, not by examination performances. All the boys were being prepared for the examination 
leading to the West African School Certificate. 


The test 


The purpose of the test was to measure a student's ability to interpret pictures three-dimensionally by 
requiring him to recognize spatial relationships portrayed in the pictures. Furthermore, since the 
investigation as a whole has a distinct educational orientation, it was considered important to design the test 
so as to be as relevant as possible to situations where pictures are employed in an educational context, e.g. 
as in reading a textbook. Usually this situation requires the student to relate a written passage in the text to 
the pictures. For example, the text might make frequent references in words to the spatial relationships 
portrayed in the picture. It was therefore concluded that the test should comprise verbal questions which 
required the student to demonstrate his understanding of the depth cues. Thus the whole task requires the 
student to understand the meaning of both the diagram and the words. 

Since previous work suggests that a common mistake is for the students to interpret pictures entirely in 
two-dimensional terms, the rationale underlying the construction of each item was to determine whether the 
student understood the picture in three- or two-dimensional terms. The nature of the pictures and questions 
was to be such that if the student interpreted the picture three-dimensionally, he would make one particular 
response, whereas, if he interpreted it two-dimensionally, he would make a different response. 


The pictures. It was decided to construct the test within the context of chemistry teaching, where it is 
particularly important for students to understand pictures used in textbooks to describe three-dimensional 
crystal and molecular structures. In three dimensions the structures are represented as models made up of 
balls and rods. As Fig. 1 shows, in two dimensions these models are portrayed as photographs or as line 
drawings comprising small circles and straight lines. A particularly important advantage of using such a 
limited range of pictures is that it is possible to control the number and nature of the depth cues without 
introducing new artistic symbols. If several different types of pictures employing additional symbols had 
been used, it could have been quite difficult to avoid confounding changes in the configurations of depth cues 
with concomitant changes in the nature of these additional symbols. 

The models which formed the basis of all three test versions were made with standard components all 
readily available from commercial suppliers. The frameworks were constructed from plastic tubing and metal 
connectors contained in the Framework Molecular Models Kit produced by Prentice-Hall. The balls were 
made of expanded polystyrene. All the models were made up of rods 12 cm in length and balls 1.8 cm in
diameter, and as shown in Fig. 2 the frameworks were based on cuboids or hexagonal prisms. 

In all the pictures the angles of view for all cuboid and hexagonal prisms were identical to those illustrated 
in Fig. 2. According to the principles discovered by Hochberg & Brooks (1960) these viewpoints maximize 
the perceived three-dimensional character of pictures of this kind of framework. 

The cued line drawings (i.e. those retaining line convergence, overlap and relative size cues) were obtained 
by tracing over the outlines of the rods and balls in the photographs. Particular care was taken to ensure that 
in giving an overlap cue the straight lines overlapped the circles to the same extent as did the images of the 
rods in the corresponding photograph. 

The diagrams with minimum cues were produced by removing the line convergence, overlap and relative 
size cues. The circles drawn on front and rear faces of the framework are equal in size; the extremities of 
the lines do not overlap the edge of the circles; all lines depicting parallel rods in the corresponding 
three-dimensional structure are drawn parallel to each other in the picture. Thus the final minimum cued 
diagrams contained only elevation cues. 

After labelling, the line drawings were reproduced in the test booklets by offset lithography. 

Type 1 question: (a) Which ball, 4 or 6, is nearer to the vertical plane PQRS? 
                 (b) Which ball, 1 or 2, is nearer to the vertical plane PQRS? 

Type 2 question: (a) Which ball, 2 or 3, is lower on the framework? 
                 (b) Which ball, 5 or 6, is at the same height on the framework as ball 1? 

Type 3 question: (a) Which ball, 3 or 6, is nearer to ball 5? 
                 (b) Which ball, 6 or 7, is further from ball 5? 

Figure 2. Examples of the different types of framework and questions used in the test. 


The labelled photographs were mounted in the test booklets as glossy prints. The line drawings were of 
a quality similar to those illustrated in Fig. 1. In each test version the frames surrounding the 
pictures were 7 cm × 9 cm in size, and the pictures of Fig. 1 are to scale. 


The questions. The three types of question emphasized horizontal and vertical relationships. Type 1 and 
Type 2 questions concerned only horizontal or vertical displacements respectively. Type 3 questions required 
the students to manipulate horizontal and vertical displacements simultaneously. 

In the early versions of the test the questions did not offer the student the choice of just two alternative 
answers. For example, the initial version of question 2a was 'Which ball is lowest on the framework?' 
However, in the pilot study it was found that the vast majority of the students confined their choice to just 
two of the possible responses - namely the answers which would be expected if the picture was interpreted 
three-dimensionally on the one hand and two-dimensionally on the other. Hence in order to facilitate the 
scoring procedure later versions of the tests presented the questions in the two-choice format illustrated in 
Fig. 2. 

The form of words for each type of question was kept identical, and the only change made concerned the 
numerical labels of the balls under consideration. Thus changes in wording could not contribute to changes 
in a student's performance. 


Structure. After undergoing one round of pre-trials the final test contained 20 different pictures associated 
with a total of 60 questions. The breakdown into combinations of cubic/hexagonal prism frameworks and 
Type 1/Type 2/Type 3 questions is summarized in Table 1. 

Table 1. Distribution of the items among the different types of question and framework 

                        Type of question 
Framework               1       2       3 
Cuboid                 10      10      14 
Hexagonal prism        10      10       6 

Total number of questions = 60. 


Administrative procedure. The test booklet was divided into four sections in order to alternate the different 
types of framework. As an introduction to each section of the test the student read a short paragraph which 
emphasized that the picture portrayed a three-dimensional object. For example in the case of Section 1 the 
introduction was as follows: ‘The photographs in this section show a cuboid framework. It consists of two 
cubes joined together at points P, Q, R, and S. The cuboid is shown sitting on a level surface, i.e. points P 
and Q are in a horizontal plane. Balls which are numbered 1 to 6 are attached to the frame.’ The insertion of 
this introductory paragraph was considered important because, as Miller (1973) points out, a student could 
well understand the three-dimensionality of pictures and yet still respond literally (i.e. two-dimensionally), if 
he was uncertain as to which interpretation was required. Research by Carlson (1966) as well as by 
Liebowitz & Harvey (1967, 1969) has also demonstrated the importance of minimizing this kind of ambiguity 
in the instructions for tests of visual perception. 

There was no time limit for the tests, and using the specially prepared answer sheet all students finished in 
less than one hour (mean time = 36 min, S.D. = 7.3 min). 

Before administering the test to subjects whose mother tongue was not English, it was considered 
important to establish that they understood the meaning of the questions. Hence as a preliminary study a 
sample of 96 comparable students took a different version of the same test in which each picture was 
replaced by the corresponding model. This test was administered on an individual basis using specially 
designed boxes containing windows which allowed the student to look at the model from exactly the same 
viewpoint used in taking the photographs. Thus if a student answered the questions correctly, it was 
assumed that he understood the meaning of the words used in the question. The score 
distributions were negatively skewed, with a mean of 53.5 and a standard deviation of 0.9. It was 
therefore concluded that the students to be involved in the main part of the experiment would have an 
adequate understanding of the words in the questions contained in the final version of the tests containing 
pictures. 


Results 


Table 2 summarizes the means and standard deviations for the various subgroups of the whole 
experimental design. It also shows for each subgroup the number of students who failed to score 
40 - the score which would be equalled or exceeded by only 1 per cent of a sample of purely 
random guessers. 
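
The cut-off of 40 can be checked with an exact binomial calculation. The sketch below assumes that a pure guesser answers each of the 60 two-choice questions correctly with probability 0.5, independently of the other questions; the function names are illustrative and are not part of the original study.

```python
from math import comb


def tail_probability(n_items: int, k: int, p: float = 0.5) -> float:
    """Return P(X >= k) for X ~ Binomial(n_items, p)."""
    return sum(
        comb(n_items, x) * p**x * (1 - p) ** (n_items - x)
        for x in range(k, n_items + 1)
    )


def chance_cutoff(n_items: int, alpha: float, p: float = 0.5) -> int:
    """Smallest score reached or exceeded by at most a fraction alpha of guessers."""
    for k in range(n_items + 1):
        if tail_probability(n_items, k, p) <= alpha:
            return k
    return n_items + 1  # no attainable score is rare enough


if __name__ == "__main__":
    cutoff = chance_cutoff(60, 0.01)
    print(cutoff, tail_probability(60, cutoff))
```

Under these assumptions the smallest score with an upper-tail probability of 1 per cent or less is 40, which agrees with the criterion used in Table 2.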

The analysis of variance revealed a main effect for differences in test versions which was 
significant at the 0.01 level (F = 5.77, d.f. = 2, 99). The main effect due to differences in 
experience of technical drawing was not significant at the 0.05 level (F = 1.71, d.f. = 1, 99). 
Neither was the interaction effect (F = 0.32, d.f. = 2, 99). 

Having planned at the outset to investigate the nature of the effects due to increasing the 
number of depth cues, the significance of two corresponding orthogonal comparisons was 
tested. Thus in one comparison the mean score obtained with the minimum cued diagrams was 
significantly lower (P < 0.01) than the combined mean scores for cued diagrams and the 
photographs (t = 3.23, d.f. = 99). In the second comparison the mean score for the cued diagrams 
was not significantly different (P > 0.05) from that of the photographs (t = 1.04, d.f. = 99). 

Table 2. Summary of the results obtained for the different groups of students 

Type of student                                       Photographs   Fully cued      Minimum cued    All 
                                                                    line drawings   line drawings   versions 
Technical drawing specialists   Mean                  54.4          53.9            48.0            52.1 
(n = 15 in each group)          S.D.                   6.6           8.9            12.2            10.0 
                                No. scoring < 40       2             1               4               7 

Not technical drawing           Mean                  54.1          49.9            44.3            49.4 
specialists                     S.D.                   7.1          10.1            13.8            11.4 
(n = 20 in each group)          No. scoring < 40       2             4               7              13 

All groups                      Mean                  54.2          51.6            45.9            50.5 
                                S.D.                   6.9           9.8            13.2            10.9 
                                No. scoring < 40       4             5              11              20 

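
The two planned comparisons reported in the Results can be written as contrast weights on the three test versions. The sketch below is illustrative only: it uses the 'All groups' means from Table 2, assumes equal group sizes per version (35 students each, as the table implies), and the dictionary keys and function names are hypothetical. It reproduces the size of each mean difference and checks that the two contrasts are orthogonal; it does not recompute the reported t values, since the error mean square is not given in the text.

```python
# Mean scores for all students, taken from Table 2.
means = {"photographs": 54.2, "fully_cued": 51.6, "minimum_cued": 45.9}

# Comparison 1: minimum cued diagrams vs. the average of the other two versions.
rest_vs_minimum = {"photographs": 0.5, "fully_cued": 0.5, "minimum_cued": -1.0}
# Comparison 2: photographs vs. fully cued line drawings.
photo_vs_cued = {"photographs": 1.0, "fully_cued": -1.0, "minimum_cued": 0.0}


def contrast_value(weights: dict, group_means: dict) -> float:
    """Weighted sum of group means for one planned comparison."""
    return sum(weights[g] * group_means[g] for g in group_means)


def are_orthogonal(a: dict, b: dict, tol: float = 1e-12) -> bool:
    """With equal group sizes, contrasts are orthogonal if their weight products sum to zero."""
    return abs(sum(a[g] * b[g] for g in a)) < tol


if __name__ == "__main__":
    print(contrast_value(rest_vs_minimum, means))  # difference in mean score, comparison 1
    print(contrast_value(photo_vs_cued, means))    # difference in mean score, comparison 2
    print(are_orthogonal(rest_vs_minimum, photo_vs_cued))
```

On these figures the first comparison corresponds to a difference of about 7 points and the second to a difference of about 2.6 points.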
Discussion 


Table 2 shows that even in the group with the minimum of depth cues in the pictures the vast 
majority of the students can be considered to be at a level which is significantly greater than that 
which would be expected by chance. It may be concluded therefore that most of the students 
had a considerable level of understanding of the portrayal of depth in all three types of pictures. 
The performance of the group using the minimum of depth cues parallels that of McGurk & 
Jahoda (1974), who found that elevation cues alone were sufficient to create a significant level of 
understanding. It will also be noted that the level of understanding in the case of the 
photographs produced a mean score of the order of that which in pilot studies had been obtained 
for the test version incorporating models instead of pictures, although in the latter instance the 
variability in scores was much less than in the former. 

The analysis of variance and subsequent orthogonal comparisons demonstrate that the ability 
to understand these spatial relationships increases as the number of depth cues increases above 
the minimum set which can be provided. The results also suggest that the cues of texture, shade 
and shadow, specific to the photographs, were not nearly so important as the cues which 
distinguish fully cued line drawings from the minimum cued line drawings. It is interesting to 
note that, as a consequence of depleting the depth cues, the Necker inversion may be readily 
experienced with minimum cued line drawings. Although the nature of the correct answer is 
unaffected by whichever Necker form is perceived, it is conceivable that the occurrence of 
spontaneous reversals contributes to the difficulty in recognizing spatial relationships in these 
diagrams. 

The lack of significant main effects relating to the amount of previous technical drawing 
experience is to be contrasted with the conclusions of previous research that performance 
increases with amount of previous relevant education (e.g. Hudson, 1960, 1962; Kilbride, 
Robbins & Freeman, 1960; Mundy-Castle, 1966). While the correct explanation for this 
discrepancy is not apparent, it may be that with secondary school students, who are already in 
the top few per cent of the population as regards general academic ability, the differences 
produced by specialized training in technical drawing are just not large enough to be detected 
using this test. It must also be remembered that the students were not allocated to the technical 
drawing classes on a random basis, and it is possible that the reason for the obtained results is 
associated more with the criteria which the students themselves used in opting for the technical 
drawing course, rather than with the effect of the specialized training. 

The most important implications of this experiment concern the production of diagrams to be 
used in the practical teaching situation for this area of subject matter with these students. Thus 
there would appear to be significant disadvantages in using diagrams which contain only 
elevation cues. However, there appear to be no educational advantages in using expensively 
produced photographs. Well-drawn line drawings, provided they are fully cued, have been 
shown to have no significantly different effect on the students’ understanding of spatial 
relationships. 


Effects of high intensity white noise on short-term memory for position in 
a list and sequence 


Safar Daee and J. M. Wilding 





Seven experiments are described investigating the effect of high intensity white noise during the visual 
presentation of words on a number of short-term memory tasks. The findings were: 

1. In a free recall task recall of items decreased at the highest intensity used (85 dB) compared with a 
quiet and a 75 dB condition. 

2. In free recall, recall by category decreased and recall in the original sequence increased in the 75 dB 
compared with the other two conditions. 

3. Recall of the position of words in the list increased as noise intensity increased, but only when the 
learning of position was incidental, not when it was intentional. It is inferred that the effect is due to 
direction of attention or change in the learning strategy. 

4. Recall of the original sequence (as shown by the ability to give in response to a word from a list the 
word which had followed it in the original list) was superior in the 75 dB compared with the other two 
conditions, but only when recall of the second word was required, not when it had to be recognized among 
all the items from the original list. It is argued that this can be explained if noise intensity affects the 
strength of traces and hence the interconnexions established between them, on which retrieval depends. 

The results for position learning are compatible with the theories of Hockey & Hamilton (1970) or Dornic 
(1973), but the results for sequence learning cannot be explained by either of these theories. A final 
experiment confirmed a prediction from the above theory that when recalling the original sequence, 
omissions (recalling no word) will decrease and transpositions (giving the wrong word) will increase as 
noise level increases. 




The experiments to be reported were concerned with the effect of high intensity white noise 
(WN) on short-term memory. Some experiments have shown that high intensity WN during 
learning produces an impairment of short-term recall [McClean (1969), using paired associate 
learning (PAL) and comparing 85 dB WN with no noise; Berlyne et al. (1965), using PAL and 
comparing a range of intensities from 35 to 75 dB with no noise]. Other experiments, however, 
have found no effect or a beneficial effect of WN during learning on recall. Berlyne et al. (1965) 
found a slight improvement in their first experiment when comparing 58 dB with no noise, but a 
deterioration when the noise level was increased to 75 dB; Hörmann & Todt (1960) obtained a 
similar result; Berlyne, Borsa, Hamacher & Koenig (1966), using PAL, found no difference in 
immediate recall between 75 dB WN and no noise during learning, and cite a number of German 
experiments using serial verbal learning where beneficial effects of WN on recall were reported. 
Haveman & Farley (1969) found no effect of WN on immediate recall in PAL or free recall and 
no effect on PAL recall 24 hours later. Hockey & Hamilton (1970), after a single presentation of 
a list, asked subjects to write items in spaces representing the position in the presentation and 
found no difference between an 85 dB and a 55 dB WN condition in the total number of items 
recalled irrespective of whether position in the list was correctly recalled. Hamilton, Hockey & 
Quinn (1972), using PAL, compared 85 dB and 55 dB WN and found no difference when the 
order of items was randomized in the test trial in the usual way. Archer & Margolin (1970) tested 
recognition of two-digit numbers and found that a 1 sec burst of WN at 100 dB before or after 
the presentation aided subsequent recognition. 

It is commonly assumed that WN affects arousal and some physiological evidence supports 
this assumption (Magoun, 1963; Davis, 1948; Berlyne & Lewis, 1963) but doubts about the 
unitary nature of arousal and its behavioural consequences must induce caution in interpreting 
the data. Uehling (1972), Craik & Blankstein (1975) and Eysenck (1976) reviewed the evidence 
on arousal and learning; they agree that there is good support for a positive relation between 
arousal at learning and long-term recall, and that the results for short-term recall are confused, 
differing for different experimental paradigms. Results using WN as an arouser agree with the 
finding that arousal at learning facilitates long-term recall (Berlyne et al. 1965; Berlyne & Carey, 
1968; Haveman & Farley, 1969, in a free recall but not in a PAL task; McClean, 1969). 

The precise effects of WN during learning on short-term recall are, therefore, as yet unclear. 
There is no clear-cut relation apparent between noise intensity and the result, and there is some 
suggestion in the data that different experimental paradigms may produce different results. There 
is, however, ample evidence available to refute Broadbent's (1971) claim that white noise only 
affects performance when it reaches an intensity over 90 dB. 

There is also some evidence that WN affects learning qualitatively by increasing reliance on 
order information and some evidence that other stressors may have similar effects. Hockey & 
Hamilton (1970) found that, while the noise level did not affect the number of items recalled 
when correct position was not taken into account (see above), the percentage of items recalled in 
the correct position was higher under 80 dB WN (P = 0.055). Hamilton et al. (1972) 
found that 85 dB WN produced better performance in their PAL task than 55 dB when the order 
of the pairs remained constant from trial to trial. They explain the results by reference to 
evidence obtained by Hockey (1970a, b, c) that when arousal is high attention is narrowed to 
sources given high priority. A similar suggestion had earlier been made by Easterbrook (1959) 
who suggested that the range of cues utilized is reduced under high arousal; he assumed a 
narrowing of the visual field rather than a redeployment of attention. Hockey & Hamilton (1970) 
suggest that with high arousal, subjects take in less information but attend to it more and 
Hamilton et al. (1972) further suggest that the extra processing capacity allocated to the main 
task is used to preserve order information. They imply that the extra capacity may be used to 
process any relevant cue and is not allocated specifically to order information. 

Thus Easterbrook suggested that less information is processed when arousal increases, while 
Hockey & Hamilton suggest that the same amount is processed but what is processed changes. 
In neither case is an explanation offered of how increased arousal narrows attention. 
Easterbrook states that he proposes no explanation but refers to Hull's (1943) idea that increased 
drive increases the relative strength of dominant responses. Hockey (1970c) refers to Spence's 
(1956) version of this theory as ‘an adequate description of most of the data’. However, while 
Easterbrook's views can be explained in the Hullian manner, Hockey & Hamilton's 
redeployment hypothesis clearly cannot. An increase in the strength of dominant responses could 
not produce an increase in a previously non-dominant tendency to process information about order. 

There are other grounds for questioning Hockey & Hamilton's theory. In their 1970 
experiment, spatial cues were associated with each item as well as order cues, but the former 
were learned less well and the latter better with the higher noise level. The authors simply 
assume that spatial cues were irrelevant and order cues relevant to the task, but the only 
justification was that each spatial position was associated with two items and each position in the 
list with only one item. Despite the ambiguity of spatial position it would have proved a useful 
auxiliary cue and should therefore have been processed better at the higher intensity of noise. 
There is as yet no other evidence whether or not the effect of noise is specific to order cues or 
can affect any auxiliary cue, as Hockey & Hamilton assume; there are some indications that 
other stressors also affect the ability to recall order information (see below). 

A second problem is that, if increased arousal leads to a focusing of attention on the main 
task, with learning of any additional relevant cues, its effects will only show when there is 
competition from a subsidiary task or from irrelevant information which attracts attention 
(Hockey, 1970a). Consequently increased arousal should have either no effect on a single task 
when learning is intentional and attention focused by instructions, or possibly a small beneficial 
effect if some incidental distraction is thereby excluded. Incidental learning of cues irrelevant to 
the main task should be reduced and incidental learning of cues relevant to the main task 
increased. McClean’s (1969) result could be explained in this way since the subjects were led to 
believe that the main task was other than learning the paired associate items and increased 
arousal should have increased the tendency to ignore the latter, giving worse recall. However, 
this explanation seems unlikely, as recall after a delay was better with high arousal at learning. 
Berlyne et al.’s (1965) results were obtained with an intentional learning situation, so present a 
problem for Hamilton & Hockey. Similarly, though some of the experiments using other 
methods of varying arousal did use incidental learning and found high arousal associated with 
worse short-term recall, others used intentional learning and obtained the same effect, and some 
of those using incidental learning also obtained better long-term recall with high arousal at 
learning. 

Dornic (1973) has suggested that noise acts like other methods of increasing task difficulty. He 
found that several such methods had little effect on the retention of lists of items in the correct 
order, but reduced the probability of recall when the order was not retained. He argues that 
increased difficulty induces regression to a more primitive ‘parrotting back’ form of learning. He 
does not state whether such a strategy can also be induced by instructions to learn in order. The 
lists in his experiments consisted of a mixture of letters and digits and increase in difficulty also 
tended to reduce grouping by category in recall. Dornic’s suggestion is reminiscent of the ideas 
of Craik & Lockhart (1972) about depth of processing. Also Schwartz (1975) has suggested that 
increased arousal facilitates recall based on physical rather than semantic properties of stimuli. 

The experiments to be described endeavoured to confirm the effects of WN on the learning of 
order information and to distinguish between a number of possible interpretations of this effect. 
In the results cited (apart from Hockey & Hamilton, 1970), learning of order was incidental to 
the main task of learning and reporting items and it is unclear whether the observed effects are 
due to differences in a retrieval strategy or a learning strategy or whether the effect is a more 
‘automatic’ effect on the processes involved in storing and retrieving information. 

M. W. Eysenck (1975) has argued that studies using WN during learning ‘may have reached 
erroneous conclusions as a result of ignoring the subject’s state of arousal at the time of the 
retention test’. He found that WN affected ability to retrieve information from semantic 
memory, but it does not follow that WN has no effect during learning. Moreover Berlyne et al. 
(1965) found that WN at recall had no effect on performance. Uehling & Sprinkle (1968) found 
that WN at recall improved performance in a serial learning task when recall was 24 hours or a 
week after the learning, but not when it was 3 min after, probably because of a ceiling effect in 
the latter case. Therefore the question of how WN affects memory in general remains unsettled; 
still less is known about how it affects memory for order information. 

A second point requiring clarification is how exactly order is learned. Recall of order may 
reflect either recall of position in the list or recall of sequential relations of items or both and 
recall of position in the list may depend on recall of relative distance from the end of the list or 
recall of ordinal position (Heslip & Epstein, 1969). It is not possible to disentangle recall of 
position and recall of sequence completely, but it can be done partially, by examining 
probability of recall in sequence when position is not recalled or recall of position when the 
preceding item is not available as a cue for sequence. 

No attempt is made in these experiments to test the view of Hamilton & Hockey that any 
useful retrieval cue may be processed under higher arousal since the main object has been to 
establish the existence and nature of the order phenomenon. Consequently the experiments do 
not provide and were not intended to provide a critical test of the difference between Hamilton 
& Hockey’s theory and that of Dornic. The experiments begin with a free recall task to discover 
the effect of WN on a number of aspects of performance; subsequent experiments required 
intentional learning of position and sequence information to determine more exactly how noise 
effects occurred. 

Ideally experiments involving arousal should include controls for the time of day at which 
testing occurs, since arousal level and memory performance vary with the time of day (Blake, 
1967; Baddeley, Hatter, Scott & Snashall, 1970; Hockey, Davies & Gray, 1972). Such control 
did not prove feasible in the present experiments and subjects were assigned randomly to 
conditions when they became available. 


Experiment I 

Method 

Materials. A free recall task was used with 40 items taken from lists of nouns of varying length given in 
Tulving (1962) and Bousfield, Cohen & Whitmarsh (1958). Twenty of the words were unrelated, ten were 
animal names and ten were vegetable names. They were presented in random order on a memory drum, one 
every 2 sec. White noise was recorded on tape from a White Noise Generator (Dawe Instruments Type 419c) 
on a Uher 5000 tape-recorder (frequency range 40 to 16000 Hz) and played back through headphones (AKG 
Type K60, 600 Ohms). The noise levels were selected fairly arbitrarily to cover the range where effects had 
been previously observed. 


Subjects. Sixty subjects took part; as in all the experiments these were volunteer students from many 
different departments or applicants for entry to the college, who volunteered to take part in an experiment 
while visiting the department for an interview. 


Design. Subjects were assigned randomly to a quiet (headphones but no noise) condition, a 75 dBc or an 
85 dBc condition, 20 subjects to each condition. 


Procedure. Subjects were instructed that they would be shown a list of words in the window of the memory 
drum, to look carefully at them and afterwards to recall as many as they could in the order they occurred 
to them. Four minutes were allowed for recall, longer than any subject required. Recall was spoken and 
recorded on tape. 


Aim. The aim was to discover whether white noise during presentation affected recall efficiency and recall 
strategy in a free recall situation and in particular whether any changes would appear in recall by category 
and recall by order, as found by Dornic (1973). Dornic’s findings predict that recall of items will decline, 
recall by category will decline and recall in the original sequence will increase as noise level increases. 
Hamilton and Hockey make no such clear-cut predictions. Recall of items should be unimpaired as noise 
level increases or possibly improved; whether category or sequence information is used as an additional 
retrieval cue more frequently in noise depends on direction of attention. 


Results 


(i) Total recall was scored by counting the number of items correctly recalled. 

(ii) Recall by category was scored by calculating the average length of a sequence (cluster) of 
animals or vegetables (e.g. if a sequence of three animals were recalled, then one unrelated 
word, then two vegetables, the average cluster length is 2.5 items; unrelated words were not 
included in this measure). This measure takes no account of the possibility that an idiosyncratic 
organization is present which would show up as repetition of recall order on subsequent recall 
trials. 

(iii) Recall in sequence was scored by calculating the probability that a word in recall followed 
the word which had preceded it in the original presentation (e.g. if seven words were recalled 
and two followed the word that had preceded them in the presentation, the probability of recall 
in the original sequence is two out of six, since the first word recalled cannot follow another 
item, or 0.33). 
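
The scoring rules (ii) and (iii) can be sketched in code. This is an illustrative reading of the verbal description, not the authors' procedure: the function names are hypothetical, unrelated words are represented by None, and a single categorized word standing alone is assumed to count as a cluster of length one.

```python
from typing import Optional, Sequence


def mean_cluster_length(recalled_categories: Sequence[Optional[str]]) -> float:
    """Average length of runs of consecutive same-category words in recall.

    Each entry is the category of a recalled word (e.g. "animal"), or None
    for an unrelated word. Unrelated words end a run and are not counted.
    """
    runs = []
    current_category = None
    current_length = 0
    for category in recalled_categories:
        if category is not None and category == current_category:
            current_length += 1
        else:
            if current_length:
                runs.append(current_length)
            current_category = category
            current_length = 1 if category is not None else 0
    if current_length:
        runs.append(current_length)
    return sum(runs) / len(runs) if runs else 0.0


def sequence_probability(presented: Sequence[str], recalled: Sequence[str]) -> float:
    """Proportion of recalled words (after the first) that directly follow,
    in recall, the word that preceded them in the presentation."""
    if len(recalled) < 2:
        return 0.0
    position = {word: i for i, word in enumerate(presented)}
    hits = sum(
        1
        for previous, current in zip(recalled, recalled[1:])
        if previous in position
        and current in position
        and position[current] == position[previous] + 1
    )
    return hits / (len(recalled) - 1)


if __name__ == "__main__":
    # Three animals, one unrelated word, two vegetables.
    print(mean_cluster_length(["animal"] * 3 + [None] + ["vegetable"] * 2))
    # Seven words recalled, two of them following their original predecessor.
    print(sequence_probability(list("abcdefghij"), ["c", "d", "a", "h", "i", "f", "b"]))
```

The two calls reproduce the worked examples in the text: a mean cluster length of 2.5 and a sequence probability of two out of six.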

The means of the three scores are in Table 1. Separate one-way analyses of variance were 
carried out on each measure, with planned comparisons of the quiet vs. noise conditions and 
the 75 vs. 85 dB condition. Total recall was significantly affected by noise level (F = 3.82, 
d.f. = 2, 57, P < 0.025) but only the comparison of 75 dB and 85 dB was significant (F = 4.86, 
d.f. = 1, 57, P < 0.05). Cluster length was not significantly affected by noise level. The 
probability of recall in sequence was affected significantly by noise level (F = 6.91, d.f. = 2, 57, 
P < 0.01) and again the planned comparison of 75 dB and 85 dB was significant (F = 12.78, 
d.f. = 1, 57, P < 0.001). However, these probabilities were very low and frequently zero, so these 
last results should be treated with some caution. 

Table 1. Mean number of words recalled out of 40, mean cluster length and mean probability of 
recall in the original sequence at each noise intensity in Expt. I 

Condition   Mean no.          Mean cluster   Mean probability 
            words recalled    length         of recall in sequence 
Quiet       14.40             1.88           0.045 
75 dB       14.05             1.53           0.093 
85 dB       11.80             1.74           0.029 

Similar analyses were carried out separately on items recalled from Primary Memory (defined 
as those with less than six items between presentation and recall) and those recalled from 
Secondary Memory (the rest). There were no significant effects of noise level on PM, only about 
one item being recalled throughout, and the results for SM paralleled those for all items taken 
together. (This was true for all subsequent experiments and separate analyses of items recalled 
from SM will not be discussed in future.) 
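
The Primary/Secondary Memory split can be sketched as follows. The text gives only the criterion (fewer than six items between presentation and recall); counting both the words presented after an item and the words recalled before it as intervening items is an assumption made here, and the function name and the pm_limit parameter are hypothetical.

```python
from typing import List, Sequence, Tuple


def split_primary_secondary(
    presented: Sequence[str], recalled: Sequence[str], pm_limit: int = 6
) -> Tuple[List[str], List[str]]:
    """Classify recalled words as Primary Memory (PM) or Secondary Memory (SM).

    A word is treated as recalled from PM when fewer than pm_limit items
    intervene between its presentation and its recall. Intervening items are
    assumed to be the words presented after it plus the words recalled before it.
    """
    position = {word: i for i, word in enumerate(presented)}
    primary: List[str] = []
    secondary: List[str] = []
    for output_position, word in enumerate(recalled):
        if word not in position:
            continue  # intrusions are not classified
        later_presentations = len(presented) - 1 - position[word]
        intervening = later_presentations + output_position
        if intervening < pm_limit:
            primary.append(word)
        else:
            secondary.append(word)
    return primary, secondary


if __name__ == "__main__":
    words = [f"w{i}" for i in range(10)]
    print(split_primary_secondary(words, ["w9", "w8", "w0"]))
```

In this example the last two presented words, recalled first, fall in PM, while the first presented word falls in SM.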


Discussion 


Total recall declined as noise intensity increased, though the effect is only marked at 85 dB; this 
supports the results of McClean (1969) and Berlyne et al. (1965) using paired associate learning. 
The other two measures both show a non-monotonic relation to noise level; there is a 
non-significant tendency for recall to be clustered less into categories and a significant tendency 
for recall to retain the original sequence more closely in the 75 dB condition compared with the 
others. This non-monotonic relation is a new finding, but not discrepant with the existing results 
on recall of order since Hockey & Hamilton (1970) used only two noise levels of 55 dB and 
80 dB and Hamilton et al. (1972) used 55 dB and 75 dB. 

The tendency for recall in categories to decline with 75 dB WN agrees with the findings 
reported by Dornic (1973) using other stressors and one finding quoted by him using 95 dB WN 
in a free recall task (Hörmann & Osterkamp, 1966). However, the present results suggest that 
recall by category only declined with a WN intensity of 75 dB and increased again at 85 dB, 
which is at variance with the latter result. There is no evidence in the present data that recall is 
less efficient when recall by sequence increases at 75 dB. 

The decline in total recall as noise intensity increased supports Dornic's prediction rather than 
that of Hamilton and Hockey. The non-monotonic relation between the other two measures 
and WN intensity was not predicted by either Dornic or Hamilton and Hockey. 


Experiment II 


This was a partial replication of Expt. I with an additional control for personality differences. 
According to H. J. Eysenck (1960) extraverts are high and introverts low in autonomic reactivity 
and Broadbent (1971) argues that this dimension may be the same one that is affected by noise. 
If so, noise should affect introverts and extraverts differently. Since a medium level of arousal 
is generally optimal (Jones, 1960) increased noise intensity should facilitate extraverts' 
performance up to some intensity level then impair it at higher levels; introverts' performance 
should be impaired at lower levels. M. W. Eysenck (1974) found these effects when studying 
retrieval from long-term memory. The effects on the other recall measures observed in Expt. I
might also be expected to occur at lower WN intensities in introverts, though the exact 
combinations of personality and noise intensity would be critical and precise predictions difficult 
to make. Expt. II used only the quiet and 75 dB conditions of Expt. I; the main object was to 
replicate the previous result rather than to study personality differences in detail. 


Method 

Materials. Free recall of 40 words of varied length was again used, consisting of twenty unrelated nouns 
with a frequency of 1 per million (Thorndike & Lorge, 1944), ten parts of the human body and ten precious 
stones (Battig & Montague, 1969). 


Subjects. Eighty subjects took part. Half were classified as extraverts, with a score of more than 11 on the 
Eysenck Personality Inventory Form A (Eysenck & Eysenck, 1962), and half as introverts, with a score of 
less than 11. Subjects scoring exactly 11 were dropped from the experiment. 
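
The classification rule just described can be written out as a short sketch. The function name `classify_epi` and the example scores are hypothetical and serve only to illustrate the cutoff of 11; nothing here is taken from the authors' own materials.

```python
def classify_epi(extraversion_score, cutoff=11):
    """Return 'extravert', 'introvert', or None for a dropped subject."""
    if extraversion_score > cutoff:
        return "extravert"
    if extraversion_score < cutoff:
        return "introvert"
    return None  # a score exactly at the cutoff: subject excluded


# Hypothetical EPI Form A extraversion scores, for illustration only.
scores = [7, 11, 15, 12, 10]
groups = [classify_epi(s) for s in scores]
print(groups)
```

Subjects for whom the function returns `None` correspond to those dropped from the experiment.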


Design and Procedure. Subjects were first given the EPI Form A and then assigned randomly to the quiet or 
75 dB condition, 20 extraverts and 20 introverts to each. Instructions and procedure were as in Expt. I.


Table 2. Mean number of words recalled out of 40, mean cluster length and mean probability of
recall in the original sequence by introverts and extraverts at each noise intensity in Expt. II

Condition   Group       Mean no.         Mean cluster   Mean probability
                        words recalled   length         of recall in sequence
Quiet       Introvert   15.00            3.35           0.20
Quiet       Extravert   16.45            3.36           0.45
75 dB       Introvert   14.35            2.53           0.55
75 dB       Extravert   16.90            2.98           0.75

Results 


The results were scored as in Expt. I and are shown in Table 2. Extraverts recalled significantly more
words than introverts (F = 5.98, d.f. = 1, 76, P < 0.025); noise intensity had no effect on total
recall. There was a tendency for extraverts' recall to improve and introverts' to decline under
75 dB WN, but the interaction was far from significant. On the other two measures there were
no personality differences, but mean cluster length decreased significantly under 75 dB WN
(F = 4.38, d.f. = 1, 76, P < 0.05); recall in the original sequence increased under 75 dB WN but
not significantly.

Discussion 

The superior total recall of extraverts agrees with previous results (see Uehling, 1972; Craik & 
Blankstein, 1975) and supports the superiority of short-term recall when arousal is lower. There 
was, however, no statistical support for differential effects of noise on extraverts and introverts 
in this experiment. 

The results on recall by category and recall in sequence show the same pattern as in Expt. I, 
but the pattern of significance has changed. The lists in the present experiment were somewhat 
easier, giving higher scores on all three measures. Again there was no evidence that the change 
in the type of recall affected the total number of items recalled. 

These two experiments, therefore, provided some but by no means conclusive support for an 
effect of noise intensity on the nature of the learning or retrieval processes. Dornic’s (1973)
suggestion that noise reduces the use of conceptual features and increases ‘parroting back’
received limited support, but the evidence of Expt. I that the effect of noise intensity is 
non-monotonic presents a difficulty. 

Since the type of recall in a free recall task is incidental to the task of recalling as many items 
as possible, it is not clear whether the effect of WN operates at learning or is due to a strategy 
operating at the time of recall. Experiments III and IV, therefore, tested retention of position in 
the list directly and sequence indirectly by requiring subjects to place items in their original 
position in the list. In Expt. III subjects did not know beforehand the type of recall that would 
be required, so the learning of position and sequence was incidental; in Expt. IV they were told 
beforehand what would be required so learning of position was intentional. Thus Expt. III tested 
whether the effect of WN was still present when the strategy at recall was restricted and Expt. 
IV tested whether it was still present when the strategy at learning and recall was restricted. 

It does not of course follow from the recall requirement to place items in position that 
subjects actually carry out either the recall or learning by doing just that. They may use position 
cues alone or sequence cues alone or both position and sequence cues. Consequently in scoring 
the results, recall of both position and sequence information was examined. 

Hockey and Hamilton would predict improved recall of position with increasing noise intensity 
in Expt. III if position is treated as a relevant additional cue; however, noise intensity should 
have little or no effect when position is to be learned intentionally since attention can be focused 
on this task by the instruction. Dornic would also predict better recall of position in Expt. III as 
noise intensity increases, if the learning of position is involved in the order effects he observed. 
His prediction for Expt. IV is also probably the same as Hamilton & Hockey's, but it is not 
clear whether he thinks that the higher memory process can be voluntarily suppressed in favour 
of the lower process which handles order, when following an instruction to learn position. From 
now on it will be assumed that he does accept this. 


Experiments III and IV 
Method 


Materials. Twenty unrelated words of varied length were used with a frequency of 47 per million (Thorndike 
& Lorge, 1944). They were presented on a memory drum as in the previous experiments. A tray with 20 
numbered compartments was used in the recall test, and 20 cards, each with one of the 20 words typed on it. 


Subjects. Sixty subjects served in Expt. III and 60 in Expt. IV (all but a few subjects served in only one of 
the two experiments). 


Design. Subjects were assigned randomly to a quiet, 75 dB or 85 dB condition, 20 to each condition in each 
experiment. 


Procedure. Subjects were instructed that they would be shown a list of words in the window of the memory 
drum, to look carefully at them and be prepared to recall them. In Expt. III after the words had been 
displayed a tray was uncovered with 20 slots numbered 1 to 20. The 20 words typed on cards were given to 
the subjects and they were instructed to sort each word into the correct slot for its position in the original 
list. Five minutes were allowed for this. Subjects were asked to put all the cards in slots, even if they were 
not sure about the correctness of the position. All cards placed in the slots remained visible, so sequence 
could be used as a cue. Correction of earlier placements was permitted. 

In Expt. IV the tray was shown and the instructions for recall were given before the words were shown on 
the memory drum. 

Results
Five scores were calculated: 

(i) Total number of words in the correct position. 

(ii) Total number of words following the same word as in the original list. 

(iii) Total number of words preceding the word they had followed in the original list (that is, 
recalled in the reverse sequence). 

(iv) The probability of a word in the correct position, given a preceding word in the wrong 
position (that is, position correct but sequence incorrect). 

(v) The probability of a word following the same word as in the original list, given a preceding 
word in the wrong position (that is, sequence correct but position incorrect).
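
The five scores can be computed mechanically from a subject's final arrangement of cards. The following is a minimal sketch under stated assumptions, not the authors' scoring procedure: the function name `score_placement` is hypothetical, every slot is assumed to hold exactly one card, and the first slot (which has no preceding word) is assumed to be excluded from scores (ii) to (v). Here `placed[s]` gives the original list position (counting from 0) of the word the subject put in slot `s`.

```python
def score_placement(placed):
    """Score one subject's tray.

    placed[s] is the original 0-based list position of the word in slot s.
    Returns the five scores described in the text as a dict.
    """
    n = len(placed)

    # (i) words in the correct position
    correct_position = sum(1 for s in range(n) if placed[s] == s)

    # (ii) words following the same word as in the original list
    forward = sum(1 for s in range(1, n) if placed[s] == placed[s - 1] + 1)

    # (iii) adjacent pairs placed in the reverse of their original order
    reverse = sum(1 for s in range(1, n) if placed[s] == placed[s - 1] - 1)

    # Slots whose preceding slot holds a wrongly positioned word.
    after_wrong = [s for s in range(1, n) if placed[s - 1] != s - 1]

    if after_wrong:
        # (iv) position correct although the preceding word is misplaced
        p_position = sum(1 for s in after_wrong if placed[s] == s) / len(after_wrong)
        # (v) sequence correct although the preceding word is misplaced
        p_sequence = sum(
            1 for s in after_wrong if placed[s] == placed[s - 1] + 1
        ) / len(after_wrong)
    else:
        p_position = p_sequence = None  # undefined: no misplaced predecessors

    return {
        "correct_position": correct_position,
        "forward_sequence": forward,
        "reverse_sequence": reverse,
        "p_position_given_wrong_predecessor": p_position,
        "p_sequence_given_wrong_predecessor": p_sequence,
    }


# Hypothetical five-word example: the third and fourth words are swapped.
example = score_placement([0, 1, 3, 2, 4])
print(example)
```

Note that when the preceding word is misplaced, a word in its correct position cannot also be in correct sequence, and vice versa, which is why scores (iv) and (v) separate the two kinds of information.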

Since some subjects served in both experiments, the data could not be combined for a single 
analysis. Analyses of variance were carried out on each measure in each experiment separately, 
and on the basis of the results of Expt. I planned comparisons were carried out to test firstly the 
difference between the results for the quiet and 85 dB conditions combined and the 75 dB 
results, and secondly the difference between the quiet results and the 85 dB results. The mean 
scores and the significance of these two planned comparisons are shown in Table 3a (Expt. III) 
and Table 3b (Expt. IV).
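
The two planned comparisons can be expressed as contrasts on the three condition means. The sketch below is illustrative only: it assumes a one-way between-subjects design with a pooled within-group error term, which the text does not spell out, and the function name `contrast_f` and the example data are hypothetical. With the conditions ordered quiet, 75 dB, 85 dB, the weights (1, -2, 1) compare quiet and 85 dB combined against 75 dB, and the weights (1, 0, -1) compare quiet with 85 dB.

```python
def contrast_f(groups, weights):
    """F ratio for a single-d.f. planned contrast in a one-way design.

    groups  : list of lists of scores, one list per condition
    weights : contrast weights, one per condition, summing to zero
    Returns (F, (df_contrast, df_error)).
    """
    means = [sum(g) / len(g) for g in groups]
    n_total = sum(len(g) for g in groups)
    df_error = n_total - len(groups)

    # Pooled within-group mean square (assumed error term).
    ss_error = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_error = ss_error / df_error

    # Contrast value and its sum of squares.
    psi = sum(w * m for w, m in zip(weights, means))
    ss_contrast = psi ** 2 / sum(w ** 2 / len(g) for w, g in zip(weights, groups))

    return ss_contrast / ms_error, (1, df_error)


# Hypothetical scores for quiet, 75 dB and 85 dB (three subjects each).
data = [[1, 2, 3], [4, 5, 6], [1, 2, 3]]
f_nonmonotonic, df = contrast_f(data, (1, -2, 1))   # quiet + 85 dB vs. 75 dB
f_linear, _ = contrast_f(data, (1, 0, -1))          # quiet vs. 85 dB
print(f_nonmonotonic, f_linear, df)
```

With 20 subjects in each of three conditions the error term under these assumptions has 57 degrees of freedom.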


Table 3. Mean number of words out of 20 placed in the correct position, mean number of words
placed in the original sequence, mean number of words placed in the reverse sequence, mean
probability of placing in the correct position given that the preceding word was not in the correct
position, and mean probability of placing in the correct sequence, given that the preceding word
was not in the correct position, at each noise intensity, and significance levels of the planned
comparisons between noise intensities in (a) Expt. III and (b) Expt. IV

Condition                  Mean no.   Mean no.   Mean no.    Mean probability    Mean probability
                           words in   words in   words in    of position         of sequence
                           correct    correct    reverse     correct, sequence   correct, position
                           position   sequence   sequence    incorrect           incorrect
(a) Expt. III
Quiet                      3.05       2.80       1.10        0.08                0.13
75 dB                      3.85       4.05       1.55        0.08                0.15
85 dB                      4.55       2.72       0.98        0.12                0.07
Quiet + 85 dB vs. 75 dB    n.s.       P < 0.05   P < 0.05*   n.s.                P < 0.05
Quiet vs. 85 dB            P < 0.05   n.s.       n.s.        P < 0.05            n.s.
(b) Expt. IV
Quiet                      4.85       4.45       1.10        0.12                0.14
75 dB                      4.95       4.10       1.15        0.12                0.10
85 dB                      5.55       4.65       1.65        0.11                0.14
Quiet + 85 dB vs. 75 dB    n.s.       n.s.       n.s.        n.s.                n.s.
Quiet vs. 85 dB            n.s.       n.s.       n.s.        n.s.                n.s.

* One-tailed test.

Discussion 


Memory for position improved as noise intensity increased in Expt. III, while memory for 
sequence was best at 75 dB WN. The same pattern of results occurred when items correct in 
position but not in sequence and items correct in sequence but not in position were examined. 
There was also a tendency to recall items more often in the reverse sequence to the original 
presentation in the 75 dB condition (P < 0.05, one-tailed test). The results therefore confirm that
noise intensity affects memory for position and sequence, though not in exactly the same way. 
Experiment IV produced no significant effect of noise intensity on any measure. Compared 
with Expt. III, memory for sequence improved in the quiet and 85 dB conditions so that
performance at 75 dB was worse than in the other two conditions in Expt. IV. The same was 
true for recall in the correct sequence with position wrong. 

The direct test for memory about position in Expt. III shows that more information is stored 
about this under high intensity noise when the nature of the recall is unknown, but Expt. IV 
suggests that intentional learning of position is not affected by noise intensity; if so, the result of 
Expt. III is due to a strategy, such as the direction of attention or use of a lower order memory 
process, adopted during learning, which can be induced by instructions or high intensity noise. 
Either Hockey and Hamilton or Dornic could therefore predict this result. However, the critical 
test for a difference between the results of Expts. III and IV, in the form of an interaction 
between type of learning (incidental or intentional) and noise level, cannot be carried out for the 
reasons already given. Inspection of the data suggests that there was no marked difference 
between the two experiments in the effects of noise. A decision on whether memory for position 
is affected by noise level when learning is intentional must therefore await the evidence of Expt. 
V below.

The results on memory for sequence are more difficult to interpret. In both experiments recall 
of sequence was incidental to the main task of recalling position (though they are not 
independent) and the different effects of noise intensity may be due to this. As pointed out 
above, when recall by sequence is incidental to the main task of recalling items the observed 
differences may reflect retrieval or learning processes. It will be necessary to examine intentional 
recall and learning of sequence before firm conclusions can be drawn. This is done in Expt. VI 
below. The non-monotonic relation between noise intensity and recall of sequence from Expts. I 
and III is not predicted by either Hamilton and Hockey or Dornic. 

Experiment V was designed to check the finding from Expt. IV that intentional learning of 
position was not significantly affected by WN intensity. A further control was included. In Expt. 
IV sequence cues were also available from previously placed items when subjects were placing 
items in position; in Expt. V such cues were reduced by removing items when they had been 
placed in position. 


Experiment V 

Method 

Design and procedure. A list of 12 nouns five or six letters long was presented on the memory drum 
(frequency 47 per million). Pilot tests showed that a longer list produced very poor performance. Twenty 
subjects were assigned to each of the quiet, 75 dB and 85 dB conditions. They were instructed that a list of 
words would be presented on the memory drum, asked to look carefully at them and told that afterwards 
they would be asked to place the words in their position in the list. A tray with 12 compartments, numbered 
1 to 12, was used in the recall test. Items had to be posted through a slot and were not visible once they had 
been posted. In the test items 2 to 11 were given to the subject in random order on cards. Subjects were not 
prevented from posting more than one item through any slot and no feedback was given. 


Results 


The number of items placed in the correct position was counted. The means are shown in 
Table 4. There were no significant differences between the three conditions. 


Table 4. Mean number of items out of ten placed in the correct position at each noise intensity
in Expt. V

Condition   Mean no. items
Quiet       2.6
75 dB       2.3
85 dB       3.2

Discussion


The result of Expt. IV, which showed no effects of noise intensity on the intentional learning of 
position, was therefore confirmed. Position learning was therefore only affected by noise 
intensity when it was incidental, which suggests that noise operates on the direction of attention 
in the way assumed by Hockey and Hamilton or by Dornic’s theory (if it is assumed that noise 
induced the lower order memory process in Expt. III and instructions to learn position induced it 
at all noise intensities in Expts. IV and V). Other experiments are needed to test between these
two theories. 

WN intensity appeared in Expt. III to affect sequence learning differently from position 
learning. However, sequence learning was not directly tested, so the results may only reflect 
changes in the strategy used for recalling position. Incidental learning of sequence could be 
tested directly by requiring subjects to report which item followed a test item in the original 
sequence without informing them of the nature of the test beforehand. Intentional learning of 
sequence, as in the next experiment, can be tested by informing the subjects about the nature of
this test beforehand. Furthermore, a comparison of recall and recognition performance will 
provide evidence on whether the effects of WN intensity lie in the learning or the retrieval 
process. 

Experiment VI was designed therefore to test recall of sequence directly without recall of 
position by asking subjects to respond to words drawn from a previously exposed list with the 
word which had followed in the original list. Subjects were instructed beforehand on the nature 
of the recall task so learning of sequence was intentional. In addition recall was compared with 
recognition to determine whether the effects of noise operated on learning or retrieval. Both 
Dornic and Hockey and Hamilton would predict at most a small beneficial effect when learning 
is intentional, since attention can be focused on sequence information by the instructions, and 
would predict no difference between recall and recognition, since they assume WN affects an 
aspect of the learning process. 


Experiment VI 
Method 
Material. A list of 12 nouns was presented on the memory drum. The words had a frequency of 50 per
million and were five or six letters long. Two lists were constructed, one for the recognition and one for the
recall condition.


Subjects. Sixty subjects took part from the same population as in the previous experiments. 


Design. Subjects were assigned randomly to the quiet, 75 dB or 85 dB condition, 20 subjects to each 
condition, and carried out both the recall and recognition tasks in random order, learning a different list in 
each condition. 


Procedure. Subjects were instructed that a list of words would be presented in the window of the memory 
drum, to look carefully at the words and afterwards to be prepared to say which word had followed the one 
shown to them. In the recognition task all 12 of the words were available typed on cards and subjects were 
asked to select the one which had followed the word shown to them. They were told how the recognition 
task would be run before the words were shown on the memory drum. After all the words had been 
presented on the memory drum ten of the words (excluding the first and last) were shown in random order 
one at a time and subjects were asked to supply the word which had followed in the original list. No 
feedback was given about the correctness of responses. 


Results 


The mean number of correct responses is shown in Table 5. The recall and recognition
conditions were first analysed separately. Planned comparisons were used to compare the quiet
and 85 dB conditions combined with the 75 dB condition and the quiet condition with the 85 dB
condition. In recall the 75 dB condition was significantly better than the other two (F = 7.93,
d.f. = 1, 57, P < 0.01), which did not differ significantly from each other. There was no
significant effect of noise level in the recognition condition. When the recall and recognition
conditions were analysed together, there was a significant interaction of noise level and task
(F = 3.11, d.f. = 2, 57, P < 0.05), due to the difference between the 75 dB condition and the
others (F = 4.89, d.f. = 1, 57, P < 0.05). The main effects of noise level and task were not
significant in the combined analysis.


Table 5. Mean number of items correctly recalled and recognized as having followed the ten
probe items at each noise intensity in Expt. VI

Condition   Recall   Recognition
Quiet       1.65     2.45
75 dB       2.80     2.30
85 dB       1.55     1.75


Discussion 


The findings of Expts. I and III of a non-monotonic effect of noise intensity on sequence 
learning were thus confirmed in the recall condition. The finding that this effect only occurred in 
the recall condition raises two questions. Firstly, why was such an effect absent in Expt. IV? 
Secondly, why in Expt. III, in a task similar to the recognition condition in Expt. VI, were 
significant effects of noise on sequence learning found? The present result, according to current 
views of the difference between recall and recognition (e.g. Kintsch, 1970; Wilkins, 1971) implies 
an effect of noise upon retrieval rather than learning, since no effect appeared in the recognition 
test. Yet Expt. III required no item retrieval and an effect of noise on recall of sequence 
information occurred. 

One indication in Expt. IV which may help to resolve the first question is the slightly worse 
recall of sequence in the 75 dB compared with the other conditions; it is possible that this was 
because the instruction that recall of position would be required interfered with a dominant 
tendency to learn sequence at this noise intensity, the existence of which is reflected in the 
results of Expts. I and III. In Expt. VI the instruction on the nature of recall which would be 
required matched this suggested tendency. 

Resolution of the apparent contradiction between the results of Expts. III and VI requires an 
adequate explanation of the results of Expt. VI. The distinction between storage and retrieval is 
not very clear or helpful for present purposes. Superior retrieval implies a difference in the 
organization of the input in the different conditions; the problem is to explain why such 
differences did not show in the recognition task. In the recall task possible words have to be 
retrieved which may have followed the test word, then the strongest candidate has to be 
selected. In the recognition task only the second process was required. Hence the recognition 
task should be easier if retrieval was the main source of difficulty in the recall task, but if 
selection were the main source of difficulty the recognition task would be easier than recall only 
if it reduced the number of possibilities between which selection was required. In Expt. VI the 
recognition task did not reduce the number of possibilities as all 12 words were always shown, 
so it could only prove helpful when retrieval of items was a problem. This seems to have been 
the case only in the quiet condition (and to a small degree in the 85 dB condition). Recognition 
was worse than recall in the 75 dB condition, which suggests that the selection problem was 
increased in the recognition task. 

It is suggested that increased noise or arousal level increases the duration of traces of the
stimuli. In the quiet condition traces are shortest lived and connections between the
traces of succeeding items are rare and weak. Consequently it is difficult to retrieve the
following item in response to a word and providing the possible words in the recognition task
proves helpful. At an intermediate level of noise, traces are of optimal duration to
establish a connection with the trace of the next item when it arrives, without becoming
connected to traces of later items. The probability of retrieving the correct item therefore
increases and adding other competing candidates in the recognition task causes interference.
At still higher levels of noise, traces last longer, and more interconnections develop and
therefore compete with each other.

The pattern of results in Expt. VI is explicable in these terms, but in Expt. III when all items 
were also made available there was superior learning of sequence in the 75 dB condition. In 
Expt. III all twenty items were displayed and had to be placed in their correct positions. As 
items were placed the number of possibilities remaining decreased; also additional sequential 
cues were available from items already placed, unlike Expt. VI where only the preceding item 
was given in each case. These differences are probably sufficient to explain the difference in 
result. 

This hypothesis leads to the prediction that errors of omission will be frequent at low 
noise levels and decrease as noise intensity increases and errors of transposition will show 
the reverse trend. Experiment VII tested this by repeating the recall condition of Expt. VI. 


Experiment VII* 

Method 

The words used had a frequency of 50-100 per million. Half the subjects were given one set of 12 words and 
half a different set. The noise conditions were 65, 75 or 85 dB white noise. In the recall test the 
experimenter waited until the subject indicated that no word could be recalled. All other details were as
in Expt. VI.


Table 6. Mean number of items correctly recalled and three types of error in recalling
words in sequence in Expt. VII

Condition   Correctly recalled   Omissions   Transpositions   Intrusions
65 dB       1.35                 6.05        2.35             0.25
75 dB       1.90                 3.80        4.00             0.30
85 dB       0.95                 3.80        4.75             0.50





Results 


Mean numbers of correct responses, omissions, transpositions and intrusions are shown in
Table 6. Intrusions were too rare to analyse. One-way analyses were carried out on the
other scores, with the same planned comparisons as before (the comparison of the 65 and
85 dB conditions tests linear trend). For correct responses the 75 dB condition was
significantly better than the other two (F = 5.833, d.f. = 1, 57, P < 0.025), which did
not differ significantly from each other. Omissions decreased significantly as noise level
increased (for linear trend F = 10.11, d.f. = 1, 57, P < 0.01); the first comparison yielded
no significant difference. Transpositions increased significantly as noise increased
(F = 14.90, d.f. = 1, 57, P < 0.001); the first comparison yielded no significant
difference. The conclusions were confirmed when omissions and transpositions were scored
as proportions of all errors and reanalysed, and when the means were tested against
variation between words.
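
The four response categories can be illustrated with a short sketch. The definitions used here are assumptions, since the text does not state them explicitly: an omission is taken to be no response, a transposition a word from the list other than the correct successor, and an intrusion a word that was not in the list. The function name `classify_response` and the example words are hypothetical.

```python
def classify_response(probe, response, word_list):
    """Classify one probed-recall response under the assumed definitions.

    probe     : the word shown to the subject (never the last list word)
    response  : the word the subject gave, or None if no word was offered
    word_list : the list in its original presentation order
    """
    target = word_list[word_list.index(probe) + 1]  # word that followed the probe
    if response is None:
        return "omission"
    if response == target:
        return "correct"
    if response in word_list:
        return "transposition"  # a list word, but not the correct successor
    return "intrusion"          # a word from outside the list


# Hypothetical four-word list, for illustration only.
words = ["table", "river", "stone", "candle"]
print(classify_response("river", "stone", words))    # correct successor
print(classify_response("river", None, words))       # nothing recalled
print(classify_response("river", "candle", words))   # list word, wrong place
print(classify_response("river", "garden", words))   # word not in the list
```

Under these definitions every probe yields exactly one category, so the four means for a condition should sum to the ten probes, as each row of Table 6 does.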


* We thank Linda Podemsky and Ann Jones for making available data from a pilot experiment. 

General discussion


The prediction was confirmed and the hypothesis is supported, but clearly other explanations 
are possible. For example, subjects may try to rehearse more in noise, producing improved 
performance up to a noise level intense enough to disrupt rehearsal. One way of testing
this would be to investigate whether other arousing stimuli have similar effects to noise.

A further difficulty with the explanation being offered is that it appears to predict superior 
recall in a free recall task as noise intensity increases, yet the reverse was found in Expt. I. 
However, the development of more interconnexions does not entail that more items are learned 
or that retrieval of them is easier, only that the structure which is learned is different. Expts. I 
and II showed greater reliance on sequential associations and less reliance on categorical 
associations as noise intensity increased to 75 dB. Since organization in terms of semantic 
features generally improves free recall, superior performance in the quiet condition is to be 
expected. This explanation cannot, however, be applied straightforwardly to the results of the 
85 dB condition, since in Expt. I recall by category increased slightly and recall by sequence 
decreased slightly compared with the 75 dB condition. A possible explanation is that in this 
condition multiple associations are established to a word some of which will be from the same 
category; in recall such categorically related associates would tend to be selected as the next 
word. In the 75 dB condition, however, only the next word from the original list tends to be 
associated with each word recalled, so recall is primarily by sequence. 

This explanation firmly assigns the effects of WN to the learning stage. A pervading problem 
in experiments examining the relation between arousal at learning and recall performance has 
been to disentangle arousal at learning from arousal at recall. The effects of WN could carry 
over to recall when short-term recall is tested as in these experiments. The question of whether 
the observed effects are due to differences at recall cannot easily be answered by ensuring all 
groups are equal in arousal shortly after experiencing different noise intensities and has to be 
approached by examining long-term recall (though differences in arousal may be re-established at 
recall by association and further changes in organization may occur during the delay) or by 
varying the level of WN during recall to see if the same effects are found as when WN varies 
during learning. 

In Expts. I and II recall by category declined when recall by sequence increased but it cannot 
be concluded that organization by category is not available when organization by sequence 
occurs. In these experiments recall by category necessarily reduced recall by sequence and vice 
versa, and we do not know whether, when recall by sequence increased, ability to recall by 
category was no longer available had it been required. Further experiments are needed to 
discover whether high intensity noise actually prevents certain cognitive operations on the input, 
as Easterbrook, Dornic and Schwartz (1975) believe, or whether it only produces an additional 
and preferred method of organizing recall, as Hockey and Hamilton suggest. Dornic (1973) 
observes that recall was poor in more difficult tasks if order was not retained, which implies that 
no other method of retrieval was available. Schwartz (1975) also provides evidence that semantic 
information is ignored under high arousal. 

Questions also arise over long-term recall, particularly in view of the finding that high intensity 
WN during learning improves long-term recall (Berlyne et al. 1965; McClean, 1969). This could 
be explained by greater durability of the traces or by the development of more complex 
organization of the input over time. 

Since sequence learning was affected by WN intensity when it was intentional, Hamilton and 
Hockey cannot explain the result by direction of attention, nor can Dornic explain it, granted the 
assumption that his lower order memory process can be induced by instructions to learn 
sequence. Also neither theory can explain the non-monotonic effect of noise intensity. 

In contrast, position learning was affected by WN intensity only when learning was incidental 
and this result can be explained either by changes in attention (Hamilton and Hockey) or by 
change in the memory process (Dornic), induced by high noise intensity or by instructions. Thus,
as was stated initially, the data provide no evidence relevant to a decision between these two 
theories. 

The data also have little bearing on the suggestion of Easterbrook (1959) and Hockey (1970c)
that increased arousal increases the relative strength of dominant responses, since the total set 
of possible responses and their relative dominance, and the effect of instructions on the latter, 
are unknown. There are, however, some features of the data which raise difficulties for the 
dominance theory. The increase in incidental learning of position as noise intensity increased in 
Expt. III would imply that this was a dominant response, but the absence of any effect of noise 
intensity on intentional learning of position in Expt. IV does not support this conclusion. Nor 
does the hypothesis readily explain the non-monotonic effect of noise intensity on sequence 
learning. 

Finally the results help to explain the inconsistencies in the available data discussed in the 
Introduction. Sequence learning is more important in some learning tasks such as serial learning 
than others such as free recall and of intermediate importance in PAL. Berlyne et al. (1966) cite
several serial learning experiments in which noise aided learning, as would be expected from the
present results. Results from PAL experiments vary. The non-monotonic relation between noise
intensity and recall of sequence helps to explain why such varying results have been obtained.


A criticism of Wilkins’ (1971) measure of category size, and its implication
for the Smith, Shoben & Rips (1974) model of semantic memory 


Peter E. Morris 


Smith, Shoben & Rips (1974) proposed a model for information storage and decision making on 
the semantic properties of words. This model provides a more adequate account of the nature 
and latencies of such decisions than any other previous theory. However, there is one result in 
the literature which appears incompatible with their theory and which they were unable to 
conclusively fault. This is the finding by Wilkins (1971) that it takes longer to make a judgement 
that an item is an instance of a category, the larger the size of the category. Smith et al. (1974) 
suggested that Wilkins’ measure of category size is inappropriate, but were unable to support 
their claim with quantifiable evidence. Such evidence is, however, available and is worth 
reporting since Wilkins’ (1971) paper is one of the most frequently quoted of those in the area of 
semantic memory. 

Wilkins measured category size in terms of the number of items listed as having been 
produced in each category in the Cohen, Bousfield & Whitmarsh (1957) category norms. These 
norms were derived by asking 400 subjects to list four items under each category heading. The 
question is, does the range of items so produced bear any relationship to the true size of the 
category, measured in terms of the items that would be classified or classifiable as falling into 
the category by ordinary people? This question is relevant not only to the validity of Wilkins’ 
study but also more generally, since much research on semantic memory has concerned the 
relationship of category size and latency of retrieval of items from the category (see 
Johnson-Laird, 1974; Smith et al. 1974, for reviews). Efforts at estimating category size have 
involved counting the number of items given by subjects in a 60 second period (Freedman & 
Loftus, 1971) while other studies have attempted to manipulate category size by using as small 
categories those which are subsets of larger categories (e.g. mammal and animal). The former 
method raises problems of item accessibility while the latter has the confounded variable of the 
greater abstractness of the larger categories. 

Was Wilkins’ (1971) method an appropriate measure? A straightforward alternative method of 
estimating category size is to ask subjects how big the categories are. These reports can then be 
compared with the estimates obtained by Wilkins’ method. A group of subjects can be asked to 
rate a set of categories on their size. Such subjective ratings in other areas of research have 
provided very useful and reliable estimates (e.g. the ratings of Imageability and Concreteness; 
Paivio, Yuille & Madigan, 1968; Morris & Reid, 1972). Ratings on the size of the categories of 
Cohen et al. (1957) were obtained by Battig & Montague (1969) when they were generating their 
category norms. It is therefore possible to examine the rated category size of those categories 
that were selected by Wilkins as being small and large. The result is surprising. The ratings were 
made on a seven-point scale, from 1 for very small to 7 for very large. The mean rating for 
Wilkins’ large categories is 4.83 (range 4.12 to 5.62), while for his small categories it is 5.29 (range 
4.59 to 5.83). It is clear that the rated size of the categories is opposite to that designated by 
Wilkins. What is more, the difference is actually significant at the 5 per cent level on a one-tailed 
t test (t = 1.989, d.f. = 14) even though only two sets of eight categories are involved. It seems 
that, given that the subjects can be trusted to know what they are talking about, Wilkins actually 
selected small categories when large ones were intended, and vice versa. 
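
The significance claim above can be checked from the reported statistic alone. The following is a minimal sketch, not part of the original analysis: it integrates the Student t density numerically (Simpson's rule) to obtain the one-tailed probability for t = 1.989 with 14 degrees of freedom. The function names and the number of integration steps are assumptions made for illustration.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def one_tailed_p(t, df, steps=10000):
    """Upper-tail probability P(T > t) for t > 0, via Simpson's rule on [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

# Values reported in the text: t = 1.989, d.f. = 14 (two sets of eight categories).
p = one_tailed_p(1.989, 14)
print(f"one-tailed P = {p:.4f}")
```

The resulting probability falls between 0.025 and 0.05, which agrees with the statement that the difference is significant at the 5 per cent level on a one-tailed test but would not be on a two-tailed one.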

Does this mean that when subjects are asked to give just a few examples from a large 
category that they are more consistent with one another in the example that they give than when 
they give examples from a small category? This would be an interesting finding, but the answer 
seems to be no. If the size of the 43 categories of Cohen et al., as measured by Wilkins, is 
correlated with their size as rated by the subjects of Battig & Montague, the result is a product 
moment correlation of -0.03, d.f. = 41. If the subjective ratings are accepted, then not only was 
Wilkins wrong in his initial assumption, but he was also unfortunate in his actual choice of 
categories. In reality, there is no relationship between the range of the first four items given by 
subjects and the actual size of the category. Other factors than category size must determine the 
degree of agreement among subjects on the first items of a category that come to mind. 
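
To make concrete how small the reported correlation is, the sketch below converts it to the usual t statistic for testing a product moment correlation against zero, t = r √(d.f. / (1 − r²)). This is an illustrative calculation from the two figures given in the text (r = −0.03, 43 categories), not a reanalysis of the norms; the function name is an assumption.

```python
import math

def t_from_r(r, n):
    """t statistic and degrees of freedom for testing a correlation r from n pairs."""
    df = n - 2
    return r * math.sqrt(df / (1 - r * r)), df

# Reported values: r = -0.03 across the 43 categories of Cohen et al. (1957).
t_stat, df = t_from_r(-0.03, 43)
variance_shared = (-0.03) ** 2  # proportion of variance the two measures share

print(f"t = {t_stat:.3f}, d.f. = {df}, r squared = {variance_shared:.4f}")
```

The t value is about -0.19 on 41 degrees of freedom, and the two measures share less than a tenth of one per cent of their variance, which is the sense in which there is no relationship between them.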

It seems that there are good grounds for ignoring the evidence from Wilkins’ study against the 
Smith et al. (1974) model since the variable supposedly manipulated may not have been 
adequately controlled. The problem of an appropriate measure of category size remains, but the 
use of subjective ratings should be seriously considered for further research. The predictive 
power of the alternative measures could be examined, and comparison of rating and associative 
measures may help to clarify the concept of category size. The final decision on what, if any, is 
an appropriate measure of subjective category size will depend upon the theoretical assumptions 
accepted by the investigator. The validity of subjective ratings of category size might be 
questioned, or it could be argued that both the subjective ratings and Wilkins’ measure provide 
different, but not necessarily invalid measures of a complex concept. However, unless or until 
evidence of the invalidity of subjective ratings of category size is put forward, it seems 
reasonable to assume that the subjects know what they are doing and that their ratings provide 
the most direct estimate of subjective category size. Finally, while one of the results of Wilkins’ 
(1971) studies has been seriously questioned, its other main finding, that responses are quicker to 
items more frequently given in category norms, remains. However, this result is better 
accounted for by the Smith et al. (1974) model than by other models. 
The structure of visual and kinaesthetic imagery: A free association study 
Susan Aylwin 





A free association technique was used to investigate the semantic structure of three forms of encoding: 
verbal, visual imagery, and kinaesthetic imagery. Kinaesthetic imagery involves imagined bodily 
identification with the stimuli (animal names) and is included because of its possible involvement in 
creativity, and in view of the importance of enactive representation in cognitive development. The analysis 
of evoked associations in terms of the propositional relations they bear to the stimuli, shows that the 
actor-action-object framework is particularly important in kinaesthetic imagery, and the whole-part structure 
in visual imagery. Verbal representation gives rise to various abstract and phonemically based association 
types. The relevance of these findings to creativity and to the concept of semantic memory is discussed. 





Much of the recent work on adult coding systems has concentrated on language and visual 
imagery. However, developmental studies suggest that a third coding system may also be 
available to adults, which derives from the ‘enactive’ (Bruner, 1966) or ‘sensorimotor’ (Piaget, 
1954) representations of infancy, and in which the adult uses covert action as a form of 
representation. Such representation is here called kinaesthetic imagery. 


Properties of kinaesthetic imagery 


An early description of kinaesthetic imagery comes from Francis Galton who identified three 
main types of associative responses to an association task: (i) verbal, (ii) visual imagery, and (iii) 
kinaesthetic imagery (which he called ‘histrionic’) — these last being most rapid and taking 
precedence wherever they occurred. Galton describes kinaesthetic imagery as being ‘where I am 
both spectator and all the actors at once, in an imaginary mental theatre. Thus I feel a nascent 
sense of some muscular action while I simultaneously witness a puppet of my brain — a part of 
myself - perform that action, and I assume a mental attitude appropriate to the occasion’ (1883, 
p. 198). 

There is some evidence that this form of representation is important in creative thinking 
(Walkup, 1965). Thus S. T. Coleridge (in Gerard, 1952) claimed a capacity to feel himself 
identified with objects of interest; both Hadamard (1949) and Einstein (1949) claimed to at least 
partially abandon the specialist symbolism of mathematics in favour of motor and muscular 
symbols; and Gordon (1961) found that imagined bodily identification with the problem was 
likely to lead to creative solutions in group problem solving situations. Indirect support for the 
connexion between creativity and kinaesthesis can be found in the Rorschach literature which 
relates both concepts to the apprehension of movement in inkblots (Rorschach, 1942; Anderson 
& Munroe, 1948; Rawls & Slack, 1968; Steele & Kahn, 1969). 


Semantic structure of forms of representation 


Kinaesthetic imagery, visual imagery and language have all, at one time or another, been 
discussed within the context of meaning or semantics. 
In the Motor Theory of Consciousness Titchener (1909) and Washburn (1916) maintained that 
kinaesthesis was the necessary and sufficient condition of thought, and the origin of meaning. 
Following Pompi & Lachman (1967), Paivio and his co-workers (Begg & Paivio, 1969; Yuille & 
Paivio, 1969; Begg, 1971; Paivio, 1971) suggested that the representational meaning of concrete 
sentences may be encoded in terms of visual imagery. This view however lacks power as a 
semantic theory since for Paivio the image is defined operationally, through stimulus 
concreteness, and is not seen as having articulable structure. 

In contrast to these relatively unsophisticated imagistic accounts, the recent semantic theories 
originating within linguistics show a high degree of articulation, employing ‘propositions’ as the 
basic semantic unit. Fillmore’s (1968, 1971) interpretation of propositions as consisting of a 
relation which takes deep structure cases as arguments, has been widely used in models of 
semantic memory (Kintsch, 1972; Rumelhart, Lindsay & Norman, 1972). While such models 
view semantic memory primarily as a repository of linguistic knowledge, there are indications 
that visual and enactive representations can also be described in propositional terms (Bower, 
1972; Collins & Quillian, 1972; Newell & Simon, 1972; Pylyshyn, 1973). If this is the case, it 
would expand the concept of semantic memory from being a purely linguistic store (Tulving, 
1972) to being a network of interrelated concepts which is derived from, and which may 
generate, both linguistic and non-linguistic representations. 


Free association and forms of representation 


Tulving (1972) views free association studies (e.g. those of Deese, 1965) as tapping intraverbal 
connections in semantic memory. However, if one postulates a common generative propositional 
substratum for both linguistic and imagistic representations, one can hypothesize that, through 
instructing subjects to use visual or kinaesthetic imagery to represent the stimuli, these forms of 
representation too can be characterized through the associative output. Work along these lines 
by Karwoski, Gramlich & Arnott (1944), Otto (1962) and Dominowski & Gadlin (1968), 
manipulated stimulus attributes (pictures versus words) rather than instructional set, and found 
little difference between different stimulus types. Davis (1932) looking at subject rather than 
stimulus variables, found that when associating to the word ‘childhood’, subjects who 
reported visualizing gave a higher proportion of nouns than did non-visualizing subjects. 

The current experiment seeks to extend these rather meagre findings by instructing subjects to 
represent the stimulus items (animal names) using rote (subvocal repetition), visual imagery 
(picturing the animal), and kinaesthetic imagery (imagining themselves to be the animal). The 
associations evoked in response to these forms of representation are then characterized through 
the types of propositional relation they bear to the stimuli. 

The experiment has a further subsidiary purpose: to gain evidence bearing on the hypotheses 
of Bugelski (1971) and Paivio (1971) that clustering may have an imagistic basis. Since subjects 
give three temporally separated associations to each stimulus item, the design allows a 
comparison of the extent to which triplets of associates are clustered together in recall in 
the three conditions. 


Method 

Subjects 

The subjects were 36 undergraduate volunteers from the University of Newcastle upon Tyne, 18 men and 18 
women. 


Materials 

Fifteen two-syllable animal names [Thorndike & Lorge (1944) frequencies 1-10 per million] were arranged in 
three lists each containing one small and two large mammals, one large bird and one cold-blooded creature. 
Each subject received all three lists, one under each of the three instructional sets. Within each combination 
of list and set the list was presented three times, with the subject therefore giving three associates to each 
name. Successive presentations used different random orders of the list items. 


Design 


The experiment manipulated four factors, two of which involved repeated measures. Each subject received 
all three lists and all three instructional sets in combinations determined by a Latin square design. 
Variables were: within subjects, (i) instructional set (rote, visual, kinaesthetic) and (ii) order of presentation of 
instructional set (first, second, third); and between subjects, (iii) sex and (iv) ‘nouns’ versus ‘free’ 
associates (half the subjects were asked to give nouns as associations; the others were left free to provide 
any type of response). The three lists used as stimuli were counterbalanced across other factors. 
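
The kind of counterbalancing described can be illustrated with a cyclic 3 x 3 Latin square, in which each instructional set occurs once in each order position across three subject groups. This is only a sketch of the general technique: the particular square used in the experiment is not reported, and the function name and group labels are assumptions.

```python
def cyclic_latin_square(items):
    """Build an n x n Latin square by cyclically shifting the list of items."""
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]

instructional_sets = ["rote", "visual", "kinaesthetic"]
square = cyclic_latin_square(instructional_sets)

# Rows are hypothetical subject groups; columns are first, second, third presentation.
for group, row in enumerate(square, start=1):
    print(f"group {group}: {', '.join(row)}")
```

With 36 subjects, twelve would fall in each row, which matches the 12 subjects per cell given for each combination of set and order.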


Procedure 


Subjects were tested individually in a dimly lit room. In an initial briefing they were told that the experiment 
was like a free association game in which they were to produce associations to animal names in a number of 
ways - by concentrating on the given word, imaging the animal, or pretending to be the animal - and then 
saying the first thing that came to mind. They were told that specific instructions as to the method to be used 
would be given at the appropriate time. Subjects were asked to try and give different associations on each 
occasion, and the noun group were asked to try to give nouns as associates. Subjects were warned that they 
would be asked to recall their responses. Specific instructions were given before each list: 

Rote. ‘Say the word over to yourself and then say whatever comes into your head as an associate. Please 
try not to use images here.’ 

Visual Imagery. ‘Image the animal first, then when you have done that, say whatever comes into your 
head as an associate.’ 

Kinaesthetic Imagery. ‘Pretend to be the animal, and then when you have done that, say whatever comes 
into your head as an associate.’ 

Subjects were told that it was very important to follow the instructions given, and asked to try their best 
even if they found it somewhat strange. They were not subject to time pressure, though a crude measure of 
associative reaction time was taken by noting total time for the subject to respond to each list. 

Following specific instructions as to the form of representation to be used, one of the lists was presented 
verbally by the experimenter, who noted down the subject's responses. After one association to each of the 
five animals in the list had been elicited, the same list was repeated twice more with no break between 
presentations, so that the subject produced a total of three associations to each animal, 15 associations in all. 
Subjects were then given 2 min to write down all their associations to that list. They were asked not to give 
stimulus words and to write their associations in any order. A similar procedure was then followed with the 
other two instructional sets. 


Results 
Recall 


An analysis of variance showed no significant differences between conditions in the number of 
associations recalled, but the recall levels were generally so high that any possible differences 
were probably obscured by ceiling effects. Mean recall levels were: rote, 12.2; visual imagery, 
13.1; kinaesthetic imagery, 13.3. 


Association times 


The mean times in seconds for subjects to produce the 15 associations in each condition were: 
rote, 147 sec; visual imagery, 103 sec; kinaesthetic imagery, 141 sec. An analysis of variance 
showed the differences not to be significant. 


Clustering 


The extent to which triplets of associates were clustered together in recall was assessed using 
the Bousfield & Bousfield (1966) clustering measure (see Table 1). The noun and free groups 
were combined and the data subjected to a three-way analysis of variance (from Winer, 1970, 
pp. 554ff). Instructional set was significant only at the suggestive P < 0.1 level (F = 2.638, 
d.f. = 2, 60) with visual imagery showing most and rote least clustering. Order of presentation 
proved highly significant (F = 9.593, d.f. = 2, 60, P < 0.001) with clustering increasing from trial 
position 1 to 3. F tests were carried out on separate conditions to assess improvement across 
trials, showing: rote, not significant; visual imagery, P < 0.025; and kinaesthetic imagery, 
P < 0.01. These figures do not represent straightforward practice effects since values for 
successive trials within a condition were derived from different groups of subjects. However, the 
results suggest that if subjects became aware of clustering as a possible recall strategy, they 
found it more easy to put into practice in the two imagery conditions than in rote - a finding 
which would support suggestions by Bugelski (1971) and Paivio (1971) that clustering may have 
an imagistic basis. 

Table 1. Mean clustering values (12 subjects per cell) 

Order of 
presentation    Rote     Visual   Kinaesthetic 
First           3.27     3.14     2.53 
Second          3.59     5.53     4.95 
Third           3.89     5.34     5.56 
Total          10.75    14.01    13.14 


Analysis of association types 


The schema used in classifying the associations is given below. The categories were derived 
partly from an examination of the data, and partly from extensive pilot experiments using similar 
tasks. 


A. Environmental/objects in the environment 

(1) name of country. 

(2) type of scenery/environment, e.g. grassland, mountains, zoo. 

(3) object found with, and with which animal might interact, e.g. associate lamp-post to stimulus 
bulldog. 

(4) person or creature associated with, and with whom animal might interact. 


B. Abstract. Superordinates and cohyponyms. 


C. Related to animal 

(1) attributes: colours, textures, sizes, shapes. 

(2) part of body. 

(3) evaluative, related to attitudes, e.g. crafty, sly, wild. 


D. Actions. E.g. pull, sniff, walking. Also verb phrases. 
E. Verbal. Phonetic, clang, and foreign language associations. 


F. Unclassifiable under A to E. 

While the author performed the major task of classification, a check on reliability was made by 
getting a second judge to use the above schema on 90 randomly selected associations. Interrater 
agreement was 90.0 per cent. 

Table 2 shows the results of the classification, and the outcome of Wilcoxon signed-rank tests 
(two-tailed) on pairs of conditions. The free group data show the most pronounced differences, 
though these are generally backed up by trends in the noun group. 
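
For readers unfamiliar with the test, the following sketch shows how the Wilcoxon signed-rank statistic is formed from paired scores: zero differences are dropped, the absolute differences are ranked (ties sharing the mean rank), and the ranks are summed separately for positive and negative differences. The scores below are hypothetical and chosen only to make the arithmetic easy to follow; they are not data from the experiment, and the function name is an assumption.

```python
def signed_rank_sums(first, second):
    """Return (W_plus, W_minus) for paired scores, using second minus first."""
    diffs = [b - a for a, b in zip(first, second) if b != a]  # drop zero differences
    ordered = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    pos = 0
    while pos < len(ordered):
        end = pos
        # Extend the run while the next absolute difference ties with this one.
        while end + 1 < len(ordered) and abs(diffs[ordered[end + 1]]) == abs(diffs[ordered[pos]]):
            end += 1
        mean_rank = (pos + end) / 2 + 1  # average of the 1-based ranks in the run
        for k in range(pos, end + 1):
            ranks[ordered[k]] = mean_rank
        pos = end + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus

# Hypothetical percentages for five subjects under two instructional sets.
rote = [10, 20, 30, 40, 50]
kinaesthetic = [12, 17, 35, 33, 50]
w_plus, w_minus = signed_rank_sums(rote, kinaesthetic)
print(w_plus, w_minus)
```

The smaller of the two sums is the quantity referred to tables of the Wilcoxon statistic; a two-tailed test, as used here, rejects for a small value regardless of the direction of the difference.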

The rote condition shows significantly more verbal associations (E) and countries (A1) and a 
tendency to produce more superordinates and cohyponyms (B) than other conditions. 
Table 2. Mean percentage of associates in different categories, and results of Wilcoxon 
signed-ranks tests (two-tailed) on pairs of conditions (R = rote, V = visual, K = kinaesthetic) 

              Noun group (18 subjects)                Free group (18 subjects) 
Category   R      V      K      Significant        R      V      K      Significant 
                                comparisons                             comparisons 
A1          2.3    0.4    0                         6.3    1.5    0.4   R>K**, R>V** 
A2         15.6   13.4   14.8                       8.5    9.3    8.5 
A3         21.5   23.0   30.4                      11.9    7.8   28.1   K>V**, K>R* 
A4          3.3    5.2   10.7   K>R**               3.3    4.1    7.0 
B           6.7    2.2    4.1   R>V*                7.0    6.7    3.0 
C1          7.4    9.3    3.7                      11.5   11.9    9.3 
C2         28.1   36.3   23.3   V>K**              13.0   32.2    9.6   V>R**, V>K** 
C3          1.1    1.5    1.9                       7.8   11.5    9.2 
D           0.7    0.7    2.6                       3.3    8.5   20.4   K>R**, K>V* 
E           6.3    1.9    1.5                      16.7    1.9    0.4   R>V**, R>K** 
F           7.0    6.3    6.7                      10.7    4.8    4.1 

* P < 0.05; ** P < 0.01. 


Visual imagery shows more of all type C associations (relating to animal) than rote or 
kinaesthetic imagery, the difference being highly significant for C2 - parts. 

Kinaesthetic imagery shows significantly more A3 (objects with which animal might interact), 
and D (actions) than either of the other two conditions. 


Sex differences 


Significant differences between men and women were found only in the free group, rote 
condition, where Mann-Whitney U tests showed that (i) for A3 (objects) men produce more than 
women (P < 0.05) with respectively 18.5 and 5.2 per cent of associations falling into this 
category; and (ii) for all C types taken together (attributes, parts, attitudes) women produce more 
than men: 48.1 and 16.2 per cent respectively (P < 0.05). 

Discussion 

While association tasks utilize linguistic input and output, this does not preclude the association 
being mediated through some non-verbal representational process; and for the concrete nouns 
used in the experiment it can be assumed that the pattern of relations between the stimuli and 
subjects' responses reflects a pattern of relations between their symbolized referents, e.g. 
between the different parts of a visual image. The relations observed in the data can thus be 
taken to characterize the types of representation concerned, not just a population of words. 


Such an assumption is more parsimonious than its intraverbal alternative which cannot easily 
explain how the verbal samples arising from the three sets of instructions could be different. 


Semantic relations 


Rote. The major distinguishing feature of the rote condition is a relatively high percentage of 
‘verbal’ associates, based on sound pattern, foreign language associations and phrase 
completions. Also names of countries, though not forming a high percentage of the total 
output, appear more frequently here than in the other conditions. These are relatively abstract 
associates: countries (A1), unlike environments (A2), are on a scale too large relative to 
the animal to be simply imaged together. The other relatively abstract association types, 
superordinates and cohyponyms (B), also appear more often here than in other conditions, 
though the differences are significant only for the noun group. These relatively abstract 
association types are presumably those which may be encoded and generated without imagery. 
Since the majority of subjects reported difficulties in excluding imagery in this condition, it is 
probable that the other association types involved non-linguistic mediators. 


Visual imagery. Most numerous among the visual imagery associates, and constituting about 
one-third of the total, are references to parts (C2), suggesting that the subject first visualizes the 
animal concerned and then uses the whole-part structure to focus down on one of its parts. 
These associations can be described propositionally in terms of inalienable have relationships: 
peacocks have feathers, bulldogs have teeth. It is of interest that inalienable having, which 
relates together two different logical levels - whole and part - is a type of relation which linguists 
have found particularly problematic. Relationships of having do not fit happily in Fillmore's case 
structure analysis, and Lyons (1968) maintains that to have is a mere dummy verb with no reality 
at the (linguistic) deep structure level. Bower (1972) on the other hand hypothesizes that it is just 
such structures that may be among the basic semantic relations of the visual system. 

It is possible that the hierarchically structured semantic trees described by Katz & Fodor 
(1963) and Collins & Quillian (1972) could be generated from whole-part structures, since 
one of the main criteria for grouping objects into classes is possession of similar parts. That 
hierarchically ordered semantic trees are derived from other structures is suggested by 
Anglin's (1970) work, which shows such structures to be absent in children younger than about 
seven years. 


Kinaesthetic imagery. Whereas the visual representation is organized mainly in terms of relations 
spanning different logical levels, with the concentric whole-part relation particularly important, 
the kinaesthetic representation is organized largely in terms of the linear actor-action-object 
framework. The two largest categories of associations were A3 — objects in the environment, and 
D - actions [e.g. sledges (A3), or pulling sledges (D) to reindeer]. In view of Osgood's (1971) 
suggestion that the subject-verb-object structure of language is rooted in the actor-action-object 
structure of our dealings with the real world, it is significant that this structure should be 
exhibited in kinaesthetic imagery, which is probably the closest available parallel in adults to the 
enactive representation of children. 

The addition of kinaesthetic imagery to the more usually studied visual and verbal coding is a 
valuable aid in characterizing visual imagery. ‘Characterizations’ typically require ‘contrasts’, 
and it may be that the relative neglect of visual imagery structure has been a result of the 
absence of any appropriately contrasting form of representation. 


Clustering and problem solving 


The structural analysis, together with the tentative results from the clustering analysis, suggest a 
possible explanation for the differential efficacy of the different forms of representation in 
creative problem solving. A number of authors (see Ghiselin, 1952; Koestler, 1964) maintain that 
creativity involves the connecting together of previously unrelated ideas. This connectivity 
would appear to require a form of encoding which is: (i) spatial-parallel rather than linear- 
sequential - permitting the representation of several items simultaneously; and (ii) synthetic 
rather than (or in addition to) analytic - with the entities represented being available for external 
relations rather than revealing only their internal structure. Verbal representation, being linear 
and sequential, fails on the first count, resulting in its relatively poor performance in the low 
level connectivity required for clustering. Furthermore, its tendency to evoke phonemically 
rather than semantically related associates ill befits it for the semantic connectivity required in 
creative thought. Visual imagery has spatial-parallel properties, facilitating clustering, but 
tends to reveal its structure analytically, at least in the association task used, with subjects 
frequently finding their associations by focusing on some part contained within the 
representation of the whole. Kinaesthetic representation, insofar as subjects report it to include 
some trace of visual imagery, appears also to have spatial-parallel properties, but the additional 
kinaesthetic or histrionic component permits associations to be found in objects external to the 
initial stimulus. This capacity for divergent synthetic thought may account for its reported 
superiority to visual imagery in creative thinking (Gordon, 1961; Walkup, 1965). 


Conclusions 
The association data reveal that the three forms of representation, verbal, visual, and 
kinaesthetic, have rather different properties. The semantic relations characteristic of 
kinaesthetic imagery hold together entities of similar logical types through actor-action-object 
frameworks; those of visual imagery hold together entities of different logical types, particularly 
whole and part; and verbal representation appears to provide a conceptual superstructure of 
relatively abstract knowledge, and also gives rise to a number of acoustically based associates. 
These differences in structure may account for the differential efficacy of the three modes of 
representation in creative thinking. 

The results suggest that the different forms of representation tap different aspects or strata 
of the semantic memory system. A more tentative suggestion, but one which follows Collins & 
Quillian (1972) would be that they also give rise to these different strata. The results leave open 
the question of whether these 'strata' are sufficiently different to be accounted separate stores, 
as Paivio (1971, 1975) holds to be the case for visual imagery and language; but they do show 
that different instructions concerning the representation of common subject matter can in effect 
‘tune’ the semantic system as a whole to reveal different types of semantic relationship. 
Is the lemon test an index of arousal level? 


D. W. J. Corcoran and T. G. Houston 


On the basis of numerous studies, it was predicted that salivary output to lemon juice would increase in a 
noisy relative to a quiet environment. Following the accepted procedure (Corcoran, 1964) salivary output to 
lemon juice was measured under quiet conditions; then experimental subjects selected a level of noise ‘just 
too loud for comfort’ and the salivary index was reassessed. Controls were treated identically except that 
both measures were conducted in the quiet. It was found that salivation increased in noise, and the weight of 
saliva produced correlated with the level of noise chosen. 


It is now well established that introverts salivate more than extraverts to a few drops of lemon 
juice placed on the tongue (Corcoran, 1964; Eysenck & Eysenck, 1967 a, b, c). The salivation 
measure has retest reliability, being better than 0.9 in most cases, but the relationship with 
introversion does not hold when citric acid solution or certain synthetic juices are used. In the 
earliest study gross output did not change as a function of the time of day at which the measure 
was taken, nor was output changed after subjects had been without sleep for 43 hours. Negative 
studies have been reported (e.g. Ramsay, 1969), and contrary to the early findings it would 
seem that the time of day at which readings are taken does influence the relationship (Horne & 
Ostberg, 1975). 

The first study was conducted to test the hypothesis that introverts were characterized by 
higher levels of arousal than extraverts. Such a hypothesis seemed plausible on the basis of 
findings by the present writer (1965, 1972) by Blake (1967, 1971), Colquhoun (1960), Colquhoun 
& Corcoran (1964) and Blake & Corcoran (1970). But doubt was cast upon the interpretation of 
the salivation/introversion relationship in terms of arousal differences between the personality 
types, for the following reasons: (1) why did the output of salivation to citric acid not correlate 
with introversion?; (2) why was there no circadian variation in the salivary measure?; and (3) 
why did severe loss of sleep not reduce the measure? It therefore seemed that although the 
introversion/salivation relationship had been predicted from the arousal hypothesis, the 
underlying determinant of the correlation must have been of some other form. (Perhaps 
introverts possess certain gustatory or olfactory capacities which responded well to the volatile 
compounds in real lemon juice.) 

Recently, however, Horne & Ostberg (1975) have shown that salivary output to a synthetic 
juice does increase in the afternoon and Bent (personal communication), following the usual 
procedure, has shown that the output of saliva to lemon juice almost doubles from morning to 
evening. These findings, coupled with the fact that citric acid had been used for the sleep 
deprivation tests, were sufficient to persuade the present writers to reconsider the question of 
whether the salivation test does in fact measure arousal level. 

Some of the most compelling evidence supporting the notion that sleeplessness lowers arousal 
level (and that the normal variations in sleepiness during the normal working day are also 
determined by changes in arousal) has emerged from studies in which loud ambient noise has 
been used as an additional stressor. Corcoran (1962) showed that noise reduced the decrement 
in vigilance and serial reaction performance associated with loss of sleep. Wilkinson (1963) 
confirmed this finding, and Blake (1971) showed that noise improved visual search in the 
forenoon, but not in the afternoon. Hartley (1973), in a well-executed study, confirmed the 
findings and showed also that the effect of noise upon vulnerable tasks is cumulative over time. 
Mullin & Corcoran (1977) varied the amplitude of both signal and noise proportionately (thus 
maintaining a constant S/N ratio) and showed that detection rate increased with the higher 
amplitude, but only when the task was conducted in the morning. It has also been demonstrated 
in a recent experiment that this increase in detections at the low diurnal period results from 
changes in both statistics d' and β. Thus noise and sleepiness act in opposition to one another 
and if one (sleepiness) is induced by low arousal then it is reasonable to argue that the other (noise) 
induces high arousal (Broadbent, 1963). [It is not implied that the only effect of noise is upon 
level of arousal (e.g. Hockey, 1970; Hamilton, Hockey & Rejman, 1977).] 
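
The two signal detection statistics named above can be illustrated with a short sketch. It assumes the standard equal-variance Gaussian model, in which d' is the difference between the z-transformed hit and false-alarm rates and β is the likelihood ratio at the criterion; the source does not say how the statistics were computed, and the function name and the example rates are hypothetical.

```python
from statistics import NormalDist


def sdt_indices(hit_rate: float, false_alarm_rate: float) -> tuple[float, float]:
    """Return (d_prime, beta) under the equal-variance Gaussian model.

    Both rates must lie strictly between 0 and 1.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    z_hit = std_normal.inv_cdf(hit_rate)
    z_fa = std_normal.inv_cdf(false_alarm_rate)
    d_prime = z_hit - z_fa  # sensitivity
    beta = std_normal.pdf(z_hit) / std_normal.pdf(z_fa)  # response criterion
    return d_prime, beta


# Hypothetical rates for illustration only; these are not data from the experiment.
d_prime, beta = sdt_indices(hit_rate=0.80, false_alarm_rate=0.10)
print(f"d' = {d_prime:.2f}, beta = {beta:.2f}")
```

A change in d' alone would indicate a change in sensitivity, a change in β alone a shift of criterion; the passage reports that both changed.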

If salivation output does measure arousal level, then given the latter generalization it should 
increase in noise. In addition, work reported by Davies, Hockey & Taylor (1969) and, of more 
pertinence, Hockey (1972), has shown that introverts prefer low to high levels of audio input 
when engaged in a task. Thus it seems a possibility that salivation output will relate to that level 
of noise judged by the subjects to be 'just above a comfortable level'. These predictions were 
tested in the following experiment. 


Method 
Subjects 


Subjects were 22 undergraduates of the University of Glasgow, comprising equal numbers of each sex. Six 
subjects were assigned to a control group and 16 to the experimental group. 


Procedure 


The method described by Corcoran (1964), which has been followed by later workers, was employed. First a 
dental swab is placed over the sublingual salivary gland and allowed to remain there for 15 sec. It is then 
removed and the weight of saliva secreted into the swab is measured. Another swab is then positioned and 
four drops of real lemon juice are placed on the centre of the tongue. The mouth is then closed and the 
swab removed after 15 sec. The weight obtained in the absence of lemon juice is then subtracted from the 
weight of saliva in this swab to determine the weight of saliva secreted in response to the lemon juice alone. 
The juice of several fresh lemons was mixed and the same mixture used throughout the trials. Although such 
a preparation is somewhat unstable, it was used in preference to more stable synthetic solutions for the 
reasons outlined earlier. 
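
The arithmetic of the salivary index can be sketched as follows. This is a minimal illustration, assuming that both swabs are weighed in the same units; the function name and the example weights are hypothetical and are not taken from the study.

```python
def lemon_juice_increment(baseline_weight: float, lemon_weight: float) -> float:
    """Saliva attributable to the lemon juice alone.

    baseline_weight: saliva collected in 15 sec without lemon juice.
    lemon_weight:    saliva collected in 15 sec after four drops of lemon juice.
    """
    return lemon_weight - baseline_weight


# Hypothetical swab weights, in the same (unspecified) units.
index = lemon_juice_increment(baseline_weight=0.25, lemon_weight=0.95)
print(f"salivary index = {index:.2f}")
```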

Control subjects had their salivary output measured twice with an interval of 5 min between tests; during 
the interval the mouth was rinsed with water. The salivary index of experimental subjects was determined 
first in the quiet. The subjects were requested to select a level of white noise, which was ‘just too loud for 
comfort’. This procedure in conjunction with mouth rinsing took 5 min. Whilst the selected level of noise 
was on, the salivary measure was again taken. Thus control subjects had two measures taken in the quiet 
and experimental subjects had the first measure taken in the quiet and the second in noise, with equal time 
intervals between test and retest. All tests were conducted between noon and 2 p.m., when according to 
temperature readings (Blake, 1967), arousal level changes minimally. 

The noise was amplified thermal noise recorded on magnetic tape. It was relayed through a TEAK 
recorder and a separate amplifier to two obliquely positioned loudspeakers. The microphone of the noise 
meter (Advanced Components Type SPMI) was later placed in the position of the subject’s head to estimate 
the level of noise chosen. 


Results 


Within the control group salivation output decreased non-significantly from the first to the second 
test (Table 1). A rank-order correlation between first and second readings yielded a rho of 0.96 
(P < 0.02); this reliability measure agrees with estimates from prior studies. Within the 
experimental group, the equivalent retest correlation was only 0.125, which was non-significant, 
indicating that the rank order of scores had changed from quiet to noisy conditions. 


Table 1. Salivation (mg) 


                  1st measure        2nd measure        Difference 
Group             Mean     S.D.      Mean     S.D.      (2nd minus 1st) 
Control           0.804    0.314     0.720    0.178     -0.084 
Experimental      0.512    0.080     0.872    0.192     +0.360 

The mean level of noise chosen was 67 dB, but the spread was large, ranging from 34 to 96 
dB. The mean weight of saliva increased in noise relative to quiet by about a third of a gram 
(t = 2.755, P < 0.02), supporting the major prediction. 

Within the experimental group, rank-order correlations were calculated between the chosen 
level of noise and the two salivation readings. The correlation between the first salivation 
measure (in quiet) and chosen noise level was non-significant and in the reverse direction to that 
expected (rho = 0.101). However, the second salivation reading, taken whilst the noise was on, 
correlated positively and significantly with noise level (rho = 0.65, P < 0.02). Finally, the 
difference between the salivation measures taken in quiet and noise, i.e. the increment induced 
by noise for each subject, was correlated with the noise level chosen. This correlation was also 
positive and significant (rho = 0.56, P < 0.05). 
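
For readers unfamiliar with the rank-order statistic used here, the following sketch computes Spearman's rho as the product-moment correlation of the ranks, with tied values given their average rank. The individual readings are not reported in the paper, so the data in the example are invented purely to show the call.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank-order correlation between two equal-length sequences."""
    rx, ry = average_ranks(x), average_ranks(y)
    mean_rank = (len(x) + 1) / 2  # mean of the ranks, with or without ties
    cov = sum((a - mean_rank) * (b - mean_rank) for a, b in zip(rx, ry))
    var_x = sum((a - mean_rank) ** 2 for a in rx)
    var_y = sum((b - mean_rank) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Invented noise levels (dB) and salivation readings, for illustration only.
noise_level = [34, 52, 60, 67, 75, 96]
salivation = [0.61, 0.70, 0.66, 0.90, 0.85, 1.10]
print(round(spearman_rho(noise_level, salivation), 3))
```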

Discussion 

Contrary to expectations there was no evidence that high salivators tend to choose high noise 
levels. The study differs in what may be an important respect from Hockey's, however. 
Hockey's subjects were required to complete a task whilst the noise was on, and in order to 
reduce the noise which increased gradually over time, subjects had to depress a very stiff 
switch. Under these conditions introverts apparently opt to expend the effort in return for the 
‘luxury’ of working in a low noise environment. 

The predictions based upon the simple assumptions that noise increases arousal and that 
salivation measures it were well confirmed. The overall level of salivary output was increased in 
the presence of noise, and the chosen noise level correlated both with the salivation index 
measured in noise and with the increment in salivation output from quiet to noise. Thus factor(s) 
independent of salivation output measured under quiet conditions determined the choice of noise 
level, but the noise level chosen apparently affects salivary output in proportion to its amplitude. 

It remains a matter of conjecture why the earlier experiment did not show circadian changes, 
although the main findings were replicated in other respects in later experiments. It still seems 
strange that synthetic juices are less effective. However, the present results are supportive of 
the notion that that state of the organism called its ‘arousal level’, which has been such a useful 
concept in so many studies, is related to the salivary index. 


Tobacco smoking, personality and sex factors in auditory vigilance 
performance 


J. E. Tong, Gillian Leigh, J. Campbell and D. Smith 


A total of 120 university students comprising equal groups of male and female non-smokers, smokers not 
smoking and smokers smoking, were compared for performance on a 60 min auditory vigilance task. 
Non-smokers consistently detected more signals throughout the test. A significant interaction showed that 
while non-smokers detected fewer signals as the test progressed, smokers smoking increased their number of 
detections. There were no sex differences and no overall EPI differences in scores, although extraverted 
non-smokers gave significantly higher scores than introverted non-smokers with the converse being present 
for smokers. The results are discussed in relation to hippocampal functions and indicate that smoking should 
be taken into account in experiments involving sustained attention. 


A recent experiment by Gale et al. (1972) failed to replicate traditional findings of sex, 
personality and diurnal variations in an hour-long auditory vigilance task. No difference in 
performance was found for extraversion, sex, or time of day factors. We have drawn attention 
in recent reports (Tong, Knott, McGraw & Leigh, 1974a, b; Leigh, Tong & Campbell, 1977) to 
the fact that cigarette smoking can be a powerful confounding factor in psychological research, 
especially when a prolonged testing session is involved. In some instances there are marked 
differences between deprived smokers and smokers tested after smoking and in other cases 
differences exist between non-smokers and smokers irrespective of the cigarette factor. 

Other reports also attest to the influence of smoking on vigilance and attentional performance. 
A lower number of observations was recorded by habitual smokers after smoking two or three 
cigarettes during a 36 min visual test (Hartley, 1973). This was attributed to a reduction in 
arousal as the test continued. Frankenhaeuser, Myrsten, Post & Johansson (1971) and Tarriere & 
Hartemann (1964) also report that smokers maintained initial levels of performance throughout 
a prolonged testing session only if cigarettes were given during the test but deprivation of 
cigarettes resulted in decreased efficiency over time. Conversely however, Johnston (1966) found 
that visual search improved 34 per cent for a group of habitual smokers who reduced or 
abstained from smoking for a period of two weeks. Smoking elevates the heart rate (Elliott & 
Thysell, 1968) and alters EEG negativity associated with attention (Ashton, Millman, Telford & 
Thompson, 1973). 

The present study was designed to investigate the performance of non-smokers and smokers in 
an auditory vigilance experiment very similar to that conducted by Gale et al. (1972) in order 
to explore the possibility that failure to control the smoking factor could have confounded 
the results. 


Method 
Subjects and apparatus 


A total of 120 university students between the ages of 18 and 30 took part in the experiment, divided into 
three equal groups with 40 subjects in each group. The groups were non-smokers (NS), smokers not smoking 
(SNS) and smokers smoking (SS). Each group was further equally subdivided by sex. The criterion for the 
smoker groups was that each person smoked at least 15 cigarettes per day. 

The apparatus was almost identical to that used by Gale et al. (1972). A 60 min auditory task, consisting of 
a taped recording of a continuous sequence of digits presented at the constant rate of one digit per second, 
was divided into five 12 min blocks with a 2 min silent interval between blocks. The pauses between blocks 
were inserted since several subjects in a pilot study found the continuous task too demanding and terminated 
testing. There were 60 test signal groups, each consisting of a sequence of three consecutive odd but unequal 
digits. These signals were prepared from random numbers (constrained in not having signal sequences) and 
were distributed between blocks 1-5 with no less than ten and no more than 14 test signal groups per block. 
This was to counteract any expectancy on the part of the subject for receiving an equal number of test signal 
groups within each block. The test signal groups were randomly assigned within a block with no less than 
10 sec and no more than 110 sec between them. 

The experiment was conducted in a 12-booth language laboratory with individual booths which restricted 
the subject’s viewing. The digits were received through headphones with individual volume control. The 
cigarettes used were of a regular king size commercial filter-tip brand, reported to contain 1.3 mg nicotine per 
cigarette. 

Design and procedure 


The three groups (NS, SNS, SS), subdivided for sex, formed the basic design of a factorial 3 × 2 with 
repeated measures for the five blocks of the task. 

Smokers (SNS and SS) were asked to refrain from smoking for 3 hr prior to testing. All testing took place 
in the early afternoon and all subjects were asked to take a light lunch without coffee or tea. In order to 
compare results with Gale et al. (1972), subjects first completed the Eysenck Personality Inventory Form B, 
during which time SS subjects smoked their first cigarette, being instructed to inhale normally and smoke to 
a mark 5 mm above the filter tip. They then listened to the following instructions on prerecorded tape: ‘On 
the following tape you will hear digits from 1 to 9 occurring at the rate of one digit per second. Your task is 
to listen to the digits and select all sequences of three consecutive odd but unequal digits that occur 
randomly throughout the tape; on your desk is recording paper, on which you will make a tick to represent 
every digit you hear. When you come to a sequence of three consecutive odd but unequal digits, for example 
3-7-9, your task will be to record the last digit in that particular sequence, 9 in the above example. They 
must be unequal and they must be odd. For example, a sequence such as 3-9-3 would not be considered.’ 
Then followed a 5 min practice trial under test conditions, during which time SS subjects smoked a second 
cigarette. There was also a pause for questions concerning the task. When the task requirements were clear 
to all subjects, the five 12 min blocks were presented. The entire session lasted approximately 90 min. 
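
The signal definition given in the instructions can be expressed as a short sketch. It reads 'odd but unequal' as meaning that all three digits are odd and no two are the same, which is what the rejected example 3-9-3 implies. The function names and the example digit stream are hypothetical, and the sketch reports overlapping sequences separately, a case the paper does not discuss.

```python
def is_signal(a: int, b: int, c: int) -> bool:
    """True when the three digits are all odd and all different."""
    return all(d % 2 == 1 for d in (a, b, c)) and len({a, b, c}) == 3


def signal_positions(digits):
    """Indices of the last digit of every signal in the stream."""
    return [i for i in range(2, len(digits)) if is_signal(*digits[i - 2:i + 1])]


# Hypothetical stream containing the two examples from the instructions.
stream = [2, 3, 7, 9, 4, 3, 9, 3]
for i in signal_positions(stream):
    print(f"signal ends at position {i}: subject should record {stream[i]}")
```

On this stream only 3-7-9 qualifies, so the subject would record the digit 9; the sequence 3-9-3 is rejected because two of its digits are equal.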


Results 

The responses were scored according to the number of correct test signal groups of digits per 
block detected. For analysis purposes scores were expressed as percentages of total possible 
detections within each particular block. The number of false alarms was very small, i.e. an 
overall average of far less than one per person per block. 
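
Because the number of signals differed between blocks, raw detection counts are not comparable across blocks; the percentage scoring described above can be sketched as below. The per-block signal counts are hypothetical (the paper states only that each block held between 10 and 14 signals, 60 in all), as are the detection counts.

```python
def percent_detected(hits_per_block, signals_per_block):
    """Detections in each block as a percentage of the signals presented in it."""
    return [100.0 * h / s for h, s in zip(hits_per_block, signals_per_block)]


# Hypothetical allocation of the 60 signals to the five blocks (10 to 14 each).
signals = [12, 10, 14, 11, 13]
assert sum(signals) == 60

# Hypothetical detections for one subject.
hits = [9, 8, 10, 8, 9]
print([round(p, 1) for p in percent_detected(hits, signals)])
```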

Using FANOVA factorial analysis of variance, i.e. 3 (groups) × 2 (sex) × 5 (blocks), a significant 
overall difference between groups was found (F = 6.21, P < 0.002), but no main effect for the sex 
factor and no sex × group interaction. Non-smokers showed consistently higher performance, 
with an overall mean per cent score of 74.9 compared with 65.3 and 62.2 for the two smoker 
groups, SNS and SS respectively. There was also a significant block factor (F = 6.02, P < 0.0001) 
and a significant groups × blocks interaction (F = 2.40, P < 0.015) indicating that NS showed a 
slight deterioration over blocks, whilst SS improved and SNS remained relatively stable (Fig. 1). 
The significant blocks × groups interaction warranted further analysis by blocks. Hence, taking 
each block as a separate dependent variable, univariate F tests indicated the following group 
differences: NS vs. SNS, block 1, F = 22.52, P < 0.0001; block 2, F = 10.17, P < 0.002; block 3, 
F = 4.85, P < 0.029; block 4, F = 2.60, n.s.; block 5, F = 4.96, P < 0.028. For SNS vs. SS, the 
only significant difference was in block 1, F = 5.14, P < 0.025. Similar analyses for sex within 
each block and sex × group interaction within blocks were not significant. Correlation coefficients 
between blocks for both groups were consistently between 0.43 and 0.65. The mean percentage 
scores are given in Table 1. 


Table 1. Mean per cent scores (number of correct detections) and standard deviations for 
non-smokers, smokers not smoking and smokers smoking by sex for blocks 1-5 


Blocks         1               2               3               4               5 
               % M    S.D.     % M    S.D.     % M    S.D.     % M    S.D.     % M    S.D. 
NS     M       80.0   2.21     72.9   2.12     82.1   1.98     73.6   1.99     79.0   1.29 
       F       80.7   1.90     70.4   1.95     73.6   2.27     69.1   2.34     68.5   1.95 
SNS    M       71.1   2.55     66.3   2.43     71.1   3.33     66.4   2.71     68.0   2.16 
       F       63.8   2.45     56.3   2.48     70.4   2.90     60.9   2.77     58.5   2.58 
SS     M       56.9   3.63     57.5   3.37     70.4   3.51     63.6   3.17     64.0   2.66 
       F       58.1   2.28     54.6   2.35     63.2   3.55     65.5   2.16     68.0   1.70 


Figure 1. Percentage correct signals for the three groups, non-smokers, smokers not smoking, and smokers 
smoking across five testing blocks. O, non-smokers; V, smokers not smoking; O, smokers smoking. 

Figure 2. The relation between extraversion and neuroticism scores for non-smokers and smokers, male and 
female, obtained from Eysenck Personality Inventory Form B. 


Table 2. ANOVA summary tables for auditory vigilance performance by sex, smoking group 
and test block 


(a) FANOVA factorial analysis 


                         d.f.    F        P 
Groups                   2       6.216    < 0.002 
Sex                      1       1.699    < 0.195 n.s. 
Sex × groups             2       0.326    < 0.722 n.s. 
Block                    4       6.022    < 0.0001 
Block × groups           8       2.403    < 0.015 
Block × sex              4       0.527    < 0.716 n.s. 
Block × group × sex      8       1.166    < 0.318 n.s. 


(b) Univariate analysis of variance: analysis between groups within blocks 


             NS vs. SNS                  SNS vs. SS 
             F         P                 F        P 
Block 1      22.526    < 0.001           5.254    < 0.024 
Block 2      10.177    < 0.002           1.262    < 0.264 n.s. 
Block 3      4.847     < 0.029           0.676    < 0.413 n.s. 
Block 4      2.593     < 0.110 n.s.      0.031    < 0.861 n.s. 
Block 5      4.963     < 0.028           0.338    < 0.562 n.s. 
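
As a consistency check, the overall group means quoted in the text can be recomputed from the cell means in Table 1. The sketch below assumes equal numbers in every cell (20 subjects per sex within each group, as the design states), so that the group mean is the plain average of the ten cell means. The values are read from Table 1 with decimal points restored; the block 2 value for SS males is taken as 57.5.

```python
# Cell means (% correct detections) for blocks 1-5, read from Table 1.
TABLE_1_MEANS = {
    "NS": {"M": [80.0, 72.9, 82.1, 73.6, 79.0], "F": [80.7, 70.4, 73.6, 69.1, 68.5]},
    "SNS": {"M": [71.1, 66.3, 71.1, 66.4, 68.0], "F": [63.8, 56.3, 70.4, 60.9, 58.5]},
    "SS": {"M": [56.9, 57.5, 70.4, 63.6, 64.0], "F": [58.1, 54.6, 63.2, 65.5, 68.0]},
}


def group_mean(group: str) -> float:
    """Unweighted mean over both sexes and all five blocks (equal cell sizes assumed)."""
    cells = TABLE_1_MEANS[group]["M"] + TABLE_1_MEANS[group]["F"]
    return sum(cells) / len(cells)


for name in ("NS", "SNS", "SS"):
    print(name, round(group_mean(name), 2))
```

The recomputed means agree with the quoted 74.9, 65.3 and 62.2 to within about 0.1, the size of discrepancy to be expected from rounding of the tabulated cell means.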

There were no overall sex or smoker vs. non-smoker differences on the Eysenck Personality 
Inventory for either the extraversion or the neuroticism scores. Female smokers were slightly 
higher on the neuroticism scale than female non-smokers, and both male and female 
non-smokers were slightly more extravert, but few scores were extreme (Fig. 2). When extraversion 
scores in each of the three groups were divided at the 50th percentile to compare the vigilance 
scores of the more extraverted with those of the introverted subjects, NS showed a significant 
difference at block 3 (t = 2.83, P < 0.008) and SS at block 1 (t = 2.215, P < 0.033), block 3 
(t = 2.99, P < 0.005) and block 4 (t = 2.069, P < 0.045). There were no differences for SNS. For 
the non-smokers this was in the direction of higher scores for extraversion, but the opposite was 
true for smokers. A comparison between extraverted NS and extraverted SNS showed 
significant differences at block 1 (t = 4.247, P < 0.0001) and block 3 (t = 3.23, P < 0.003) and 
also between extraverted NS and extraverted SS at block 1 (t = 7.03, P < 0.001) and block 3 
(t = 5.146, P < 0.0001). There were no differences for similar comparisons for the more 
introverted subjects. 


Discussion 

The first conclusion to be drawn from these results is that smoking can be an important factor 
influencing performance on tasks requiring sustained attention. Not only does smoking bring 
about a treatment effect but there are also group differences between smokers and non-smokers 
irrespective of whether a cigarette was used prior to testing. It is not clear at present to what 
extent nicotine facilitates performance, since the predominantly stimulant pharmacological effect 
(Russell, 1971) is not always reflected in improved performance, e.g. in visual search (Johnston, 
1966), patellar reflex (Domino & von Baumgarten, 1969), verbal rote learning (Andersson, 1975), 
simple reaction time (Cotten, Thomas & Stewart, 1971) or multiple task performance (Schori & 
Jones, 1974). However, performance on other tasks has been facilitated, especially within the 
first 30 min of smoking a cigarette, in choice reaction time (Myrsten, Post, Frankenhaeuser & 
Johansson, 1972; Lyon, Tong, Leigh & Clare, 1975), temporal discrimination (Tong et al. 
1974b), critical flicker fusion (Warwick & Eysenck, 1963) and tapping and pursuit rotor tasks 
(Frith, 1967, 1968). 

It was expected that the SS group would produce scores at least equal to the SNS group, but 
they were initially lower and showed steady improvement over the test. A recent report by 
Andersson (1975) indicated that smoking induced a decline in learning, but better retention later, 
compared with the non-smoking group, the difference being most pronounced in the first 
post-treatment block. These results were considered in the light of Walker’s consolidation theory 
(1958) where material learnt under conditions of high arousal leads to better permanent memory 
but a less available trace in short-term memory. Since our SS subjects were smoking during the 
training period, this may account for their poorer performance in block 1. 

The introversion-extraversion results may be compared with those of Tarriere & Hartemann 
(1964) in which the performance of smokers and non-smokers was compared in relation to their 
position on the extraversion scale. They reported that the rather introverted subjects displayed 
superior performances to the rather extraverted subjects and also that at a similar level on the 
introversion-extraversion scale, smokers displayed a better performance than non-smokers. 
Although no statistical data were given, their findings receive some support from the present 
results for the smokers; however, for the non-smokers the position is reversed, but in one block 
only. 

Since the present task demanded constant monitoring of the digits, the difference in scores 
between the three groups compared may be due to an attentional factor. When nicotine is 
injected intravenously, it is primarily accumulated in the hippocampus (Schmiterlöw et al. 1967), 
an area suggested as the coordinating centre for arousal and activity to produce effort in the 
control of attention, that is, the required response (Pribram & McGuinness, 1975). It may be 
important in future research to discover whether nicotine may not cause some temporary, or 
perhaps relatively permanent, changes in ability to attend over a concentrated period of time. 


A psychophysiological study of paranoid hostility and defensiveness in 
maximum security hospital patients* 


John Hinton 


In this study male security hospital patients are compared against male students on L scores, and on the 
relationship between scores on Eysenck’s P scale and an index of change in muscle tension on attending to 
perceptual discrimination tasks. In the student sample P score gave a significant positive correlation with the 
EMG index, while L scores were remarkably low. Results on the hospitalized population showed that the 
measure of increase in muscle tension was again very significantly correlated with P, but the correlation was 
negative. This switch is highly significant (P < 0.0001). Also, with patients, a significant positive correlation 
occurred between L and the EMG index. The results are interpreted by reference to L scores, since the 
deviant population had significantly higher L scores than students (P < 0.001). It is argued that many patients 
detained for an undefined period are highly defensive and fake socially good. It is proposed that those faking 
most may be the extreme paranoid and hostile types who, if they had nothing to lose by being honest, would 
normally endorse a large number of items on the P scale. 


From an experimental investigation on university students it was concluded that, to a large 
degree, Eysenck’s P scale measures the extent of endorsement of paranoid attitudes (Hinton & 
Craske, 1976). Most of the items in this scale can be readily identified as relating to paranoid and 
hostile outlook, sadism and lack of social conformity. Results reported by Verma & Eysenck 
(1973) show that in a sample of mental hospital patients, P correlated very significantly with 
measures of paranoia and hostility. Observations of high P scoring students suggest that these 
individuals are particularly hostile, irritable and paranoid in their behaviour in laboratory test 
situations (Hinton, 1975). From his investigation on Rampton patients, Davis (1974) concluded 
that high P scorers have a greater tendency to make negative evaluations. The above indications 
are fairly consistent with the speculation by Gray (1971) that P is a dimension of personality 
relating to activity level in the amygdaloid flight/fight system. 

In their paper of 1976, Hinton & Craske review evidence which indicates strongly that an 
index of increase in generalized muscle action potentials on perceptual and performance tasks, 
correlates positively with the effort made on such tasks (Benson & Gedye, 1961; Eason & White, 
1961). The extent of muscle tension increase depends on degree of difficulty perceived and 
experienced, as well as other motivational factors. In their student sample, Hinton & Craske 
(1976) found that P score correlated very significantly with a measure of percentage increase of 
muscle tension in the neck trapezius. This muscle was chosen because Eason & White (1961) 
had shown that it provided the best single index of generalized motor action potentials in 
muscles irrelevant to task performance. Hinton & Craske conclude that ‘Eysenck’s P scale 
measures paranoid reaction tendencies which are linked to attention difficulty - this being 
reflected in increase of muscle tension on tests where decisions are required without feedback’. 

Special problems apply in regard to the validity of psychometric measures on detained patients 
(summarized by Davis, 1973). Black (1963) pointed out that Broadmoor patients show high 
defensiveness (K scores on the MMPI). This appears to increase with length of incarceration and 
increasing desire to be discharged (Black, 1974, personal communication). Furthermore, L scores 
on the EPI have also been found to be significantly positively correlated with the percentage of 
life in institutions (Black, Webster, O’Neill & Hinton, 1977). It is generally found that lie scores 
(faking socially good) are much lower in students relative to the population in general. In the 
student sample used by Hinton (1975) L scores were significantly lower than those generally 
reported for students. Thus, it may be assumed that students are more likely to endorse paranoid 
or hostile attitudes (if they apply) than maximum security hospital patients. 


* The views expressed in this paper do not necessarily reflect those of the institution or Government 
department in which the Author is working. 

In investigations using a scale like P, where face validity is high, faking must vary 
considerably, dependent on situational factors such as the experimenter’s approach, the 
institutional setting and the perceived relevance of the test to the individual’s future prospects. 
Thus, inclusion of a measure of dissimulation (L) is important in interpreting psychometric data, 
especially when comparing populations in different settings. 

In this investigation, it was required to test whether P score in criminally deviant patients 
correlated significantly with the EMG index which had been found to discriminate high from low 
P scoring students. It was hypothesized that, assuming L scores were fairly low, P score 
would correlate with EMG increase on perceptual discrimination tests in maximum security 
hospital patients. 


Method 

Student sample 
Twenty-seven students aged 18-30 were tested in a university laboratory setting. Perceptual tests were given 
involving simple auditory and' visual discriminations: decisions were required on a tachistoscopically 
presented brightness matching task and on a sound lateralization task. For the latter, white noise pulses were 
presented over earphones using inter-aural time delays and intensity differences, and the subjects judged 
when the sound source appeared to reach the side of the head. (For full details see Hinton, 1975.) 


Patient sample 


Eighteen male socially deviant patients were tested in a maximum security hospital. Age range of patients 
was from 22 to 40 years, and length of detention in the hospital ranged from 4 to 12 years. Most types of 
abnormal offender were included (murderers, sexual offenders, arsonists, etc.). Half of the patients were 
given the legal/medical category of ‘psychopathic disorder’, and the other half were designated ‘mentally 
ill’. Intelligence levels were within the average range. Patients were tested on auditory and visual perceptual 
discrimination tests. These tests included judgement of shift of a sound source in a stereophonic free 
listening situation and judgement of apparent movement of a light on a visual autokinesis test. The latter test 
was given under conditions where the subject was led to believe that light movement, if it was seen, was 
produced by the experimenter. It is unfortunate that the perceptual tests given to students and patients were 
not identical. However, it was assumed that this would not be important, since the tests involved essentially 
the same sort of subjective decision making with no feedback of correctness of responses. Also, muscle 
tension responses to both audio and visual tests were highly correlated. Both patients and students indicated 
their judgements simply by pressing a button. For both groups of subjects the actual laboratory test settings 
were as near identical as possible. The test booth was controlled at a temperature of 25° ± 1° C. 

Both students and patients were asked to complete the PEN inventory after undergoing the test session. 


EMG recording 


A Devices recorder was used for both student and patient samples. A Devices high gain amplifier [was] 
employed with time constant set at 0.3 sec. High frequencies were not filtered. A 2 sec epoch [was used] 
for integration of motor action potentials. All [...] were taken from the [...] record. 


Electrode assembly 


A pair of 10 mm silver disc electrodes were attached to the skin above the right neck trapezius. [...] 
earth technique (Hinton, 1975) was employed to reduce ECG skin potential artefacts. All leads [were] 
screened to eliminate 50 Hz interference and reduce high frequency loss. Electrode-skin resistance [was] 
reduced to below 6 Kohms. 


Treatment of data 


The mean integrated EMG was calculated for the 4 sec period prior to decision making on all tests. [Data] 
from all perceptual tests were combined, since increase in muscle tension appeared to depend little [on the] 
type of discrimination test. Minimum levels of integrated EMG were also recorded. Integrated EMG on test 
was expressed as a percentage of the minimum tonic level. Thus, inter-subject differences in the number of 
effective motor units were eliminated (these differences being due partly to electrode positioning). A 
test-retest reliability check between separate testing sessions gave a correlation of 0.66 (P < 0.001 for n of 
39). This check was carried out using male and female students (Hinton, 1975). 
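
The normalization described above can be sketched as follows. The function name and the example values are hypothetical; the paper does not give the units of integrated EMG, and none are needed, since the index is a ratio of two readings from the same electrode placement.

```python
def emg_percent_of_tonic(test_emg: float, minimum_tonic_emg: float) -> float:
    """Integrated EMG on test, expressed as a percentage of the minimum tonic level."""
    if minimum_tonic_emg <= 0:
        raise ValueError("minimum tonic level must be positive")
    return 100.0 * test_emg / minimum_tonic_emg


# Hypothetical readings in arbitrary integrator units.
print(emg_percent_of_tonic(test_emg=18.0, minimum_tonic_emg=10.0))
```

On this scale a value of 100 means no increase over the resting minimum, so an index near 180 corresponds to test activity a little under twice the tonic level. Dividing by each subject's own minimum is what removes the between-subject differences attributable to electrode positioning.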


Results 


The sample statistics (Table 1) show that the mean scores on EMG percentage increase and P do 

not differ significantly between students and patients. However for the patient group P scores 

tend to be unexpectedly lower, while the EMG percentage increase is greater than for students. 
The range of L scores in the students sample is exceedingly low, with no subjects going above 

4. The mean L score is very significantly lower for the student group, relative to patients 

(P« 0-001). Table 2 shows the highly significant positive correlation between P and EMG 


Table 1. Statistical comparisons between student and special hospital samples 


Students (n = 27) Special hospital patients (n — 18) 
Mean S.D. Mean S.D. 

EMG % 179-9 69-9 211-2 94-1 

P 4-6 1-9 3-7 1-9 

L 20 1:3 6-8 5-2 





L scores are significantly lower in student sample compared to patients (P< 0-001. Student's f) 


Table 2. Correlation matrices comparing students with special hospital patients


          Male students (n = 27)             Male patients (n = 18)
          EMG %       P         L            EMG %       P         L
EMG %
P         +0.540*                            −0.659*
L         +0.136†     −0.342                 +0.628†     −0.476
          (For P < 0.05, r > 0.381)          (For P < 0.05, r > 0.468)
          (    P < 0.01, r > 0.487)          (    P < 0.01, r > 0.590)


Significance of difference between correlations (two-tailed test): * P < 0.0001; † P < 0.10.
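
The footnote to Table 2 does not say how the difference between the two samples' correlations was tested. One common choice for independent samples is Fisher's r-to-z transformation, sketched below as an assumption rather than as the authors' stated procedure; `fisher_z_difference` is a hypothetical helper name.

```python
import math

def fisher_z_difference(r1, n1, r2, n2):
    """Two-tailed test of the difference between two independent correlations.

    Uses Fisher's r-to-z transformation; returns (z statistic, p value).
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-tailed normal probability
    return z, p

# Correlations with the EMG index from Table 2: students (n = 27), patients (n = 18).
z_p, p_p = fisher_z_difference(0.540, 27, -0.659, 18)  # P scale
z_l, p_l = fisher_z_difference(0.136, 27, 0.628, 18)   # L scale

print(f"P scale: z = {z_p:.2f}, p = {p_p:.6f}")
print(f"L scale: z = {z_l:.2f}, p = {p_l:.3f}")
```

Under this assumption the P scale difference falls below 0.0001 and the L scale difference falls between 0.05 and 0.10, which agrees with the two footnote markers.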

[...] change in the student sample. N and E correlated almost zero with the EMG index. [...] gave a
completely different pattern; in fact, the correlation between P and EMG percentage [...]. Instead, the
EMG index correlates positively with the ‘fake good’ score (L) in patients. Both correlations are
statistically significant (P < 0.01).

[...] to the EMG index, the personality test scores are open to easy faking. However, the [...] gives an
indication of the extent to which the individual is faking socially good. A group of individuals with high
mean L scores might be expected to be generally defensive in responses, [...] to the P scale, since in the
case of this scale it is easy to identify ‘socially bad’ items relating to paranoia or suspiciousness,
sadism, and non-conformity in social attitudes.

Students tested under conditions of anonymity have no reason to feel threatened and are not
likely to be as defensive as maximum security hospital patients. Many of the latter may see good 
reason for responding defensively on questionnaires, thus attempting to show themselves in a 
socially desirable light to improve their chances of discharge. This reasoning could explain the 
fact that students had very low ‘lie’ scores, while detained patients gave significantly higher ‘lie’ 
scores. Also in line with this view, is the finding that patients had slightly lower mean P scores 
than students. 

The above proposals would be merely reasonable conjecture were it not for the patterns of
correlations of P and L with the objective EMG index. The argument depends on a number
of assumptions: (a) that the EMG index is unfakable, (b) that paranoid reaction tendencies are
measured by the EMG index (Hinton & Craske, 1976), and (c) that the students were honest in
their responses to the P scale. The results may be summarized as follows: under conditions of
anonymity, L scores are low and the EMG index correlates positively with P; under conditions
of indeterminate detention, dissimulation (as indexed by L) is high and the EMG index correlates
negatively with P. In explanation, it is proposed that, with detained patients, the more paranoid,
suspicious, sadistic and hostile they are, the more defensively they respond in answering P items.
Hence the reversal of correlation between P and EMG index. This should not be taken as a simple
causative argument, however. It is possible that a minority of patients endorse socially bad items
because they feel they deserve to be punished or because they adopt a ‘bloody-minded’ attitude.

The results of this investigation could have important implications (a) for psychometric 
research, where individuals in institutions are compared against non-institutionalized controls and 
(b) for the assessment of patients in maximum security hospitals, where the EMG index could 
perhaps provide a useful objective check on the degree of hostility and suspicion in patients, and 
taken in conjunction with psychometric data it could help in evaluating degree of faking. The 
index therefore might be particularly helpful in making discharge assessments, when many 
patients are at their most defensive. 

A follow-up validation study is currently in progress on maximum security hospital patients, 
using a measure of paranoid hostility which is not easily open to faking. This further research 
employs a repertory grid measure (Howells, 1975). The correlation of this measure with the 
EMG index is being investigated on admission patients. It is worth noting that the patients tested 
in this study had all been under detention for many years. Probably quite different results would 
be obtained on less defensive newly admitted patients. 
Tactual and name matching by blind children


Susanna Millar 





The hypothesis that letters can be matched on the basis of tactual physical features was tested in three 
experiments with blind Braille-reading children. 

Experiment I compared simultaneous matching of pairs of normal (S) and enlarged (E) format Braille 
letters. Under physical match instructions latencies for SS pairs were significantly faster than for other pairs, 
but matching EE pairs which compared two difficult letters under physical match instructions was no slower 
than matching SE or ES letters which compared one easy and one difficult letter under name match 
instructions. 

Experiment II showed that successive physical matching of SS and EE letters was significantly faster than 
name matching of ES and SE letters respectively. 

Experiment III tested successive matching with two types of altered format (X and Y) under instructions 
to judge whether pairs were the same letter. Latencies did not differ between XX and YY pairs, but each 
was significantly faster than either XY or YX pairs. The results were deemed to support the hypothesis. 





The study addresses the question of how tactually presented verbal material is represented in
memory. A growing number of findings suggest that, in addition to the expected verbal encoding
(Conrad, 1964; Sperling, 1967; Atkinson & Shiffrin, 1968; Norman, 1969), tactual features of the
input may also be represented in memory. Watkins & Watkins (1974) found a tactile suffix effect.
Left hand superiority for reading Braille was found by Hermelin & O'Connor (1971), who
suggested that the letters are analysed by the minor hemisphere as patterns. Millar (1975a, b)
showed that similarity in tactual characteristics of successively presented nameable objects and
Braille letters produced recall decrements by blind children. A strategy of using convergent
methods is desirable in testing the psychological reality of a finding. But the hypothesis that
tactual features can be utilized in memory also needs to be tested directly.

Evidence that memory matches can be based on physical visual characteristics prior to naming 
has come largely from studies showing that matching successive letters which have the same 
visual form and the same name (AA or aa) is faster than matching letters which have the same 
name (Aa or aA) but differ in shape (e.g. Posner, Boies, Eichelman & Taylor, 1969). Similar 
effects have been found for children (Hoving, Morin & Konick, 1974). Since visual form 
affected judgements of successive letters, it must have been coded in memory. On the 
hypothesis that tactual characteristics of verbal material are encoded, and can be utilized 
as a basis for matching, similar results would be expected. The present study explores this. 

The obvious tactual analogue to the visual physical/name match paradigm is that of Braille 
letter matching by congenitally or very early blind subjects where confounding effects of 
possible visual recoding may be excluded. The problem also has implications for Braille letter 
recognition by blind children. One difficulty is that, unlike the sighted alphabet which has upper 
and lower case forms of approximately equal familiarity, Braille is taught in one format only. 
However, if, as is sometimes suggested (Nolan & Kederis, 1969) fluent Braille readers apprehend 
Braille letters as patterns, they should be able to generalize the names to enlarged forms. Size 
differences in visual letters have been shown to affect latencies in physical/name match 
paradigms (Corcoran & Besner, 1975). But unlike visual letters, enlarged Braille letters (e.g. 
double size) would be less familiar than the standard format. Even with pre-test training naming 
latencies for enlarged forms may be longer for that reason. The relevant comparisons would, 
therefore, be between latencies for matches in which the test letter is in standard (S) format, and 
the initial letter is in either enlarged (E) or standard (S) form; and between matches in which the
test letter is in enlarged (E) format, and the initial letter is in either standard (S) or enlarged (E) form.

A question which needs to be answered first is whether simultaneous matching can be based
on physical tactual characteristics. Longer latencies for enlarged (E) letters could occur
because the format is less familiar, because they take longer to ‘feel’, because names are less well
associated with the format, or through any combination of these. In any case, simultaneous matches
would involve two difficult letters for EE, two easy letters for SS matches, while ES or SE
matches each compare one difficult (E) and one easy (S) letter. On the null hypothesis that all
matching is based on naming, the latencies for EE and SS (two difficult, two easy letters) would
equal the latencies for SE and ES (one easy and one difficult letter in each) matches. On the
hypothesis that physical tactual matches are faster, (EE + SS) < (ES + SE) latencies should be
obtained. The first experiment was designed to test this, while the second experiment concerned
successive matching. Since there are relatively few children who are totally blind or have only
minimal light but no shape perception from birth or very early infancy, the same subjects served
in both the first two experiments, the order of testing being reversed for half the subjects.
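
The logic of the two predictions can be illustrated with a minimal sketch. It assumes, purely for illustration, an additive naming-only model in which the latency of a match is the sum of the naming times of its two letters; this model, the naming times and the `physical_saving` value are all hypothetical and are not taken from the study.

```python
# Hypothetical per-letter naming times (msec); illustrative values only.
naming_time = {"S": 1400, "E": 3400}

def naming_only_latency(pair):
    """Assumed pure-naming model: latency is the sum of the two naming times."""
    return sum(naming_time[fmt] for fmt in pair)

null_prediction = {pair: naming_only_latency(pair) for pair in ("SS", "EE", "SE", "ES")}
null_same_format = null_prediction["SS"] + null_prediction["EE"]
null_mixed_format = null_prediction["SE"] + null_prediction["ES"]

# Hypothetical saving (msec) when a same-format pair is matched on feel, without naming.
physical_saving = 500
alt_same_format = null_same_format - 2 * physical_saving

print(null_same_format == null_mixed_format)  # equality under the naming-only null
print(alt_same_format < null_mixed_format)    # inequality under the physical-match hypothesis
```

Whatever naming times are chosen, the additive model makes the same-format and mixed-format sums equal, so any reliable shortfall for (EE + SS) has to come from something other than naming.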


Experiment I 
Method 


A subjects × experimental conditions × test type (same-different judgements) design was used with pairs of
simultaneously tactually presented Braille letters as follows.


Materials. Four sets of 10 Braille consonants, found previously to be least confusable tactually as well as
phonologically, were used; two sets in standard (S) and two sets in enlarged (E) format. The S format letters
were on standard ‘Unilock Word Building Device’ individual (2.25 × 2.5 cm) plaques which can be
interlocked for easy presentation. The E format letters were the same letters, enlarged by doubling the
intervals between Braille cells in each letter, thus preserving the configuration of dots but doubling the size
of the configuration. The E letters were produced by means of a Braille stylus on (2.25 × 2.5 cm) Braille paper
squares, which were fixed to individual blank plaques from the ‘Unilock Word Building Device’ set. For
durability, the Braille dots were filled with fast-hardening plastic before fixing. The letters were
C L N H G T M K S R (⠉ ⠇ ⠝ ⠓ ⠛ ⠞ ⠍ ⠅ ⠎ ⠗). Letters were presented on the ‘Unilock Word Building
Device’ tray in which interlocked letter plaques can be lodged securely.


Subjects. Subjects were 12 children from two residential schools for the blind. The criteria for selection were 
that they should be either totally blind or have only minimal light but no shape perception from birth or less 
than 20 months of age, and could name Braille letters fluently. There were five girls and seven boys, mean 
age, 10-2 (from 9:4 to 11:1), ranging from low average to superior intelligence. 


Pre-tests. All subjects were pre-tested on naming S Braille letters first. Each letter was presented singly, in 
random order, and latencies from touch to naming were recorded by hand on a TC11 timer. The criterion for 
reliable naming was two consecutive errorless runs with all letters. Only subjects who met this criterion were 
included in the experiment. The same procedure was then followed with all E Braille letters. Subjects who 
made any mistakes on these were given training (feedback) trials until they reached the criterion of two 
consecutive errorless runs. Prior to runs with either S or E letters, subjects were told that these were Braille 
letters which they were to name. They were also informed prior to E runs that in each E letter the interval 
between dots was twice the normal size, so that E letters were larger, but had the same shape as S letters. 
Subjects were reminded to palpate the whole plaque in order not to miss dots in a letter. 

For E shapes, two of the 12 subjects made no errors on the initial runs; four made one error each on the
first run, and required only one training run to reach the criterion. The other six subjects made between two
and nine mistakes overall, and required a mean of 8.2 trial runs to reach criterion (two errorless runs). Mean
naming latencies on criterion runs for S and E letters are given in Table 1. This shows that latencies for
naming E letters were significantly longer (P < 0.025 on correlated t test) than S latencies. The difference was
significant also for subjects who required no or one training trial (group I), but was clearly larger for the six
subjects who had required more training trials (group II). In the subsequent analyses, the two groups were
consequently treated separately, constituting a between-subject factor.
Table 1. Mean latencies (msec) on criterion runs for pre-test naming of S (standard format) and
E (enlarged format) letters by group I (zero or one trial to criterion) and group II (mean of eight
trials to criterion), Expts. I and II


             S         E
Group I      1308      1902
Group II     1581      4839
Means        1445      3371
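
Since the text puts six subjects in each group, the ‘Means’ row should be the simple average of the two group rows. The short check below makes that explicit; it assumes the printed means were rounded half up to the nearest msec, which is not stated in the paper, and `mean_half_up` is a hypothetical helper.

```python
from decimal import Decimal, ROUND_HALF_UP

# (group I, group II) mean naming latencies in msec, from Table 1.
group_means = {"S": (1308, 1581), "E": (1902, 4839)}

def mean_half_up(values):
    """Mean of equally sized groups, rounded half up to a whole msec (assumed rounding rule)."""
    exact = Decimal(sum(values)) / Decimal(len(values))
    return int(exact.quantize(Decimal("1"), rounding=ROUND_HALF_UP))

overall = {fmt: mean_half_up(vals) for fmt, vals in group_means.items()}

# Ratio of E to S naming latency within each group (index 0 = group I, 1 = group II).
ratio = {
    "Group I": group_means["E"][0] / group_means["S"][0],
    "Group II": group_means["E"][1] / group_means["S"][1],
}

print(overall)
print({group: round(value, 2) for group, value in ratio.items()})
```

The ratios show the pattern described in the text: the enlarged format slows naming in both groups, but far more in group II.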


Experimental conditions. Each subject was tested under four experimental conditions: (1) SS, pairs of letters
drawn from the two sets of standard format letters which were either the same or different in name and
shape; (2) EE, pairs of letters drawn from the two sets of enlarged letters which were either the same or
different in name and shape; (3) ES, pairs of letters in which the letter on the left was always drawn from
the enlarged set and the letter on the right was always drawn from the standard set. The two letters either
had the same name and configuration but differed in format, or differed in name, configuration and format; (4)
SE, pairs of letters in which the letter on the left was always drawn from the standard format set, and the
letter on the right was drawn from the enlarged format set. The letters either had the same name and
configuration but differed in format, or differed in name, configuration and format.

The four conditions were blocked, in an order counterbalanced across subjects. Prior to each block of runs,
subjects were informed of the nature of the letters (both standard / both large / the left one large, the right
one standard / the left one standard, the right one large, as appropriate). It was considered that the necessary
pre-testing on naming might induce a ‘set’ not to respond until both letters had been named in all conditions.
Consequently, instructions for conditions 1 and 2 stressed that although both shapes were letters, subjects
were not to name either, but to judge only whether or not they felt the same (had the same shape).
Instructions for conditions 3 and 4 stressed that subjects should judge whether or not the two letters had the
same name regardless of the difference in format (feel).


Test type. There were ten tests for same and ten for different pairs in each condition. For ‘same’ tests all ten
letters were paired once with the same letter (shape or name as appropriate), presented in different random order
for each subject and condition, interspersed with ten pairs of different letters, randomly selected with the
restrictions that the letter pairs immediately following each other should not repeat letters, and each
‘different’ combination should occur only once in any condition. Same/different tests were presented in
Gellermann-type order.
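
A same/different order of this kind could be generated along the following lines. The sketch is an assumption-laden illustration: the exact criteria applied in the study are not stated, so only two constraints commonly associated with Gellermann-type series are enforced here (equal numbers of each response and no long runs of the same response). The names `gellermann_type_order`, `longest_run` and the `max_run` limit of three are hypothetical.

```python
import random

def longest_run(seq):
    """Length of the longest run of identical consecutive items."""
    best = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best

def gellermann_type_order(n_same=10, n_different=10, max_run=3, seed=None):
    """Shuffle same/different trials, rejecting orders with runs longer than max_run."""
    rng = random.Random(seed)
    trials = ["same"] * n_same + ["different"] * n_different
    while True:
        rng.shuffle(trials)
        if longest_run(trials) <= max_run:
            return list(trials)

order = gellermann_type_order(seed=1)
print(order)
```

Rejection sampling is used only because it is short and transparent; the letter-level restrictions on ‘different’ pairs described above would need a further check of the same kind.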


Procedure. Subjects were tested singly in a quiet room of their school. The experiment was explained as a
game in which the subject had to match two letters as same or different as quickly as possible, but without
making mistakes. Equal numbers of ‘Smarties’ were given to all subjects as prizes for being quick and
accurate.

For each run, an interlocked letter pair was lodged in the ‘Unilock Building Device’ tray. Subjects were
asked to palpate the pair simultaneously and to say ‘same’ or ‘different’ as soon as they could. Most
subjects used the index fingers of both hands, or the index and middle fingers of either one or both hands, or
went from left to right and back again. No restrictions were placed on the method of exploring the two
letters. Latencies were recorded by hand on a TC 11 timer from first touch to response (‘same’ or
‘different’).


Results

Error rates were low, 2.6 per cent overall. They were lowest in the SS condition: 0.46 per cent,
and 2.07, 2.89 and 4.96 per cent in the SE, ES and EE conditions respectively. The EE errors
did not differ significantly from either SE or ES errors on Wilcoxon test.

Mean latencies for correct responses were subjected to a subject groups × experimental
conditions (SS, EE, ES and SE) × test type (same/different) ANOVA with repeated measures
on the last two factors. No effects of subject groups or of test type were found, and these did
not interact with any other condition. Experimental conditions had a highly significant effect
(F = 27.87, d.f. = 3, 30, P < 0.001). Mean latencies for all conditions are graphed in Fig. 1. This
shows that the effect was mainly due to the SS condition. This differed from all other conditions
at the P < 0.01 level, on Newman Keuls tests. Latencies for the EE condition did not differ from
either SE or ES latencies.


Figure 1. Mean latencies (msec) for ‘same’ and ‘different’ judgements of simultaneously presented Braille
letters under physical matching conditions in which both letters were in standard format (SS), or in enlarged
format (EE), and under name match conditions in which one letter was in standard and one in enlarged
format (SE or ES).


Since test type did not have a significant effect on latencies, it may be concluded that the
difference in discriminating on feel (format) and names was similar for ‘same’ and ‘different’
test pairs. However, results for ‘same’ test pairs are of particular interest. A separate subject
groups × experimental conditions ANOVA was, therefore, run on latencies for correct ‘same’
test responses alone. Experimental conditions had a highly significant effect (F = 14.56, d.f. = 3,
30, P < 0.001); SS differed significantly (P < 0.01) from each of the others, while EE did not
differ from either ES or SE latencies.


Discussion 


It was clear that the main effect showing faster latencies under physical match instructions was
due to SS matches which compared two easy letters. Latencies for EE matches were
significantly longer. It is, therefore, reasonable to conclude that either longer ‘feel’ time, or
familiarity of the format, affected responses. These could not have been the sole factors;
otherwise it would have to be assumed that the unfamiliar format of one letter in a match
increased latencies to the same level as two unfamiliar letters (see also Expt. III). More
importantly, the results are not consistent with the null hypothesis that all matches were based
on naming. In that case, matches should either have been equal; or, if matching depended on naming
speeds, matching two difficult EE letters should have been significantly slower than ES or SE
matches in which one letter was slow and one fast to name. The finding that SS matches were
fastest, but SE and ES matches were as slow as EE matches, is better explained by the
hypothesis that naming added to ES and SE discrimination time. This would offset the potential
advantage of having one easy letter in the SE and ES matches respectively. In any case, the
findings suggest that the physical format affected matching.

Although these results are consistent with the hypothesis that subjects can base judgements on 
tactual characteristics of the inputs, they do not answer questions about effects of physical 
tactual characteristics on memory. In simultaneous judgements both standard and test stimuli are 
physically present. The findings thus speak only to effects of physical characteristics on 
perceptual discrimination. A second experiment was therefore performed, using successive 
presentation of letter pairs, which requires that characteristics of the standard are coded in 
memory. 


Experiment II 


The experiment was designed to test the hypothesis that physical tactual features of letters in 
memory affect responses and that successive matching under physical match instructions is 
faster than matching based on naming. Since S letters were easier than E letters (pre-tests and 
Expt. I), the relevant comparisons are between latencies for the second (test) letter in the two 
conditions in which these are in the same format but standards differ in that they are either in 
the same format as the test letter, or differ in format from the test letter, but can be matched on 
name. On the null hypothesis that matching successive letters is based on naming, latencies to S
or E test letters should be equal, regardless of the physical format of the first letter, so that
(S)S = (E)S, and (E)E = (S)E. The hypothesis that (successive) physical matches are faster
predicts latency differences of the form (S)S < (E)S, and (E)E < (S)E.


Method 


A subject groups × experimental conditions × test type (same/different) design was used. Subjects, materials,
experimental conditions, and procedures were identical with those in Expt. I, except that the two stimuli to
be compared in each run of the four (SS, EE, SE and ES) experimental conditions were presented
successively and the relevant response latencies were from presentation (touch) of the test (second) letter to
response.


Procedure. For each run, an interlocked pair appropriate to one of the experimental conditions ((E)E, (S)S,
(E)S or (S)E) was lodged in the ‘Unilock Building Device’ tray, standard on the left, and test letter on the
right. The subject had to feel first the standard and then the test letter, using one finger of the preferred
hand, without moving back to the standard letter. Subjects had to judge whether the second (test) letter was
the same as or different from the first letter. Both speed and accuracy were stressed in the instructions. The
distance between the midpoint of the interlocked standard and test plaques was 2.25 cm. Subjects varied
somewhat in the speed of moving from the first to the second letter, the average ISI being approximately
1 sec. Latencies were recorded by hand on a TC 11 timer from the first touch of the second letter to the
response. There were ten ‘same’ and ten ‘different’ pairs in each experimental condition, presented in
Gellermann-type order, and experimental conditions were blocked in across-subject counterbalanced order as
for Expt. I. Subjects were informed of the conditions prior to each block of runs. As above, the (E)E and
(S)S conditions were run under instructions to judge on feel (physical format) only. For (E)S and (S)E
conditions instructions were to judge on name only, regardless of feel.
Results


Overall errors were low, 2.70 per cent. There were 4.17, 2.05 and 4.58 per cent errors in the
(E)E, (S)E, and (E)S conditions, respectively. These did not differ significantly from each other
on either the sign or Wilcoxon non-parametric tests and thus did not indicate a trade-off relation
with latencies (see below). No error occurred in the (S)S condition.
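
As a consistency check, the overall error rate can be recovered from the four condition rates. The sketch assumes that every condition contributed the same number of trials (each had ten ‘same’ and ten ‘different’ pairs per subject), so that the overall rate is the plain mean of the condition rates; the variable names are illustrative.

```python
# Error rates (per cent) by condition, as reported for Expt. II.
error_rate = {"(E)E": 4.17, "(S)E": 2.05, "(E)S": 4.58, "(S)S": 0.0}

# With equal numbers of trials per condition, the overall rate is the unweighted mean.
overall_rate = sum(error_rate.values()) / len(error_rate)

print(f"overall error rate = {overall_rate:.2f} per cent")
```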

Mean latencies for correct responses, graphed in Fig. 2, were subjected to a subject
groups × experimental conditions (SS, EE, ES, SE) × test type (same/different) ANOVA. Neither
subject groups nor test type had significant effects, or interacted with any other factor. The
experimental conditions effect was highly significant (F = 16.43, d.f. = 3, 30, P < 0.001).
Pre-planned t tests showed that (S)S latencies were significantly faster (P < 0.005) than (E)S
latencies, and (E)E latencies were significantly faster than (S)E latencies (P < 0.025). The
(E)S − (S)S difference was 608 msec between means and the (S)E − (E)E difference was 578 msec.
This meant that the relations predicted by the hypothesis were obtained.

To assess other differences Newman Keuls (a posteriori) tests were run. Apart from significant
differences between the conditions shown above, there was a significant difference between
physical matches, (S)S latencies being faster (P < 0.01) than (E)E latencies, and name matches
with an S test letter were faster than name matches with an E test letter (P < 0.05), as would be
expected on the assumption (see introduction) that E letters were more difficult. At the same
time, the name match with the easier test letter, (E)S, was not faster than the physical (E)E
match in which the test letter was the more difficult one, showing that the advantage of the
easier test letter was lost under name matching even when compared to the physical (E)E match
with the more difficult letters.

Test type was not a significant effect. Separate subject groups × experimental conditions
ANOVAs for ‘same’ and for ‘different’ pair responses confirmed that the effect of experimental
conditions was substantially the same for the two types of test. For ‘same’ pairs, experimental
conditions were significant (F = 12.02, d.f. = 3, 30, P < 0.001), and pre-planned t tests showed the
predicted significantly faster (E)E than (S)E latencies (P < 0.05) with a difference of 436 msec;
and faster (S)S than (E)S (P < 0.005) latencies by 635 msec. For ‘different’ pair responses there
was a similar effect of experimental conditions (F = 13.22, d.f. = 3, 33, P < 0.001), and (E)E
responses were significantly faster than (S)E responses (P < 0.01) by 718 msec, and (S)S
responses were significantly faster than (E)S responses (P < 0.05) by 580 msec on Newman Keuls
tests. The relations are shown graphically in Fig. 2.


Discussion 

The results were consistent with the hypothesis that tactual features of verbal stimuli are utilized 
in memory. It is clear that the data could not be explained by naming alone. In successive 
matching the test letter has to be compared with a standard in memory. If this had been a name 
only, there would be no reason to expect faster latencies for S test letters when standards were 
in S than in E format, or for E test letters when standards were in E than in S format. The fact 
that significant differences were found in conditions of low error rates must mean that the 
physical format of the letter in memory affected the responses. 

There were some differences between the present findings and results of studies with visual
material. In the latter, familiarity with the stimuli apparently does not affect matching time
(Posner & Mitchell, 1967), while here E letters took longer. This could not be attributed to
discrimination difficulties. The more difficult E letters were in fact larger. One possible
explanation is that tactual shapes are not as easily apprehended in ‘global’ form as are visual
shapes (Bamber, 1969; Krueger, 1973). This would also account for the lack of difference
between ‘same’ and ‘different’ responses. At the same time, if familiarity is an important
variable in tactual recognition, it might be argued that the results found here could be accounted
for by the assumption that the response latencies depend not only on the familiarity of the test
stimulus, but also on the familiarity of the standard in memory. Thus any, even successive,
combinations which included an unfamiliar format might produce longer latencies. It should be
noted that since the physical E format and not the letters (names or shapes) were unfamiliar, this
would imply an even stronger case for tactual effects. It would suggest that subjects judged on
physical (tactual) features alone, without naming even under name match instructions. This is
somewhat implausible. However, it was considered desirable to test letter matching with
different physical formats when these did not differ in familiarity.


Figure 2. Mean latencies (msec) for ‘same’ and ‘different’ judgements of successively presented Braille
letters under physical match conditions in which the first and second letters were either in standard, (S)S, or
enlarged, (E)E, format, and under name match conditions in which the first letter was in standard and the
second letter in enlarged format, (S)E; or the first letter was in enlarged and the second in standard format,
(E)S.


Experiment III 

Familiarity of the physical format was controlled by using two formats of Braille letters which 
both differed from the standard format familiar to the subjects. The same instructions were used 
for all conditions: subjects were asked to judge whether the second (test) letter was the same 
letter as the first (standard). On the hypothesis that tactual physical features affect memory for 
Braille letters, it would be expected that matching is faster when standards (in memory) are in 
the same format as test letters than when standards in memory are in different physical format. 
On the null hypothesis that only names are held in memory, there should be no difference 
between any of the conditions when the formats do not differ in familiarity. 
Method

A subjects × list type design was used with successive ‘same’ and ‘different’ judgements of pairs of Braille
letters in two types of format, under two conditions in which pairs were in the same physical format, and
two conditions in which they differed in physical format, as follows.


Subjects. Subjects were 12 fluent Braille readers (six boys, six girls), mean age 10-7 (from 9:6 to 13:0) totally 
blind or with minimal light (no shape) perception from birth or under two years of age. Only one subject had 
taken part in the previous experiments. 


Materials. Ten Braille letters were prepared in two formats, designated X and Y for convenience, on 
standard Braille paper. For X-format letters a standard size hand-frame was used with a very fine stylus, 
which produced slightly smaller but more pointed (raised) dots. This had the effect of preserving the normal 
size of the letters, but increasing the distance between cells. For Y-format letters a slightly enlarged frame 
was used with a very thick stylus. This increased the size of letters from 6 mm to 8 mm, but maintained the 
same distance between cells as for the X-format. Same and different letter pairs were alternated in 
Gellermann order in vertical arrays, and exposed one pair at a time. Lists consisted of either XX, YY, XY, 
or YX pairs. The distance between letter pairs was set at 10 mm from the first cell of the first to the first cell 
of the second letter for all four conditions. The order of presentation of the four lists was counterbalanced 
across subjects. Each list consisted of ten same and ten different pairs. The first four exposures of each were
treated as practice runs (not counted).


Procedure. Subjects were tested singly in a quiet room of their school. They were instructed to read letter 
pairs from left to right using one finger of their (for Braille) preferred hand, without returning to the first 
letter, and working as fast as possible. Instructions stressed that subjects were to judge whether the second 
letter was the same letter as the first or a different letter. They were informed of the type of list prior to runs 
and told that differences in feel between letter pairs were not important and to disregard them. Latencies 
were timed by hand on a TC 15 timer from first touch of the test letter to the response (‘yes’, ‘no’). 


Results 


Error rates were low, 2.92 per cent overall, with 2.50 per cent for XX, 2.08 per cent for YY,
2.5 per cent for YX and 4.58 per cent for XY conditions respectively.
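
As a rough check on these percentages, the sketch below converts each reported rate back into a whole number of errors. The denominator of 240 judgements per list (12 subjects × 20 pairs) is an assumption made here for illustration; the text does not state how many judgements entered the error count.

```python
# Error rates (per cent) as reported in the text.
reported = {"XX": 2.50, "YY": 2.08, "YX": 2.50, "XY": 4.58}

# Assumption (not stated in the source): 12 subjects x 20 pairs per list.
JUDGEMENTS_PER_LIST = 12 * 20


def implied_errors(percent, n=JUDGEMENTS_PER_LIST):
    """Nearest whole number of errors that would yield this percentage."""
    return round(percent * n / 100)


counts = {cond: implied_errors(p) for cond, p in reported.items()}
overall = 100 * sum(counts.values()) / (JUDGEMENTS_PER_LIST * len(counts))

for cond, k in counts.items():
    print(f"{cond}: {k} errors -> {100 * k / JUDGEMENTS_PER_LIST:.2f} per cent")
print(f"overall: {overall:.2f} per cent")
```

Under that assumption the four rates correspond to 6, 5, 6 and 11 errors, and the pooled rate comes to 2.92 per cent, which agrees with the overall figure given above.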

Mean latencies for correct ‘same’ responses graphed in Fig. 3 were subjected to a 
subjects × list type (XX, YY, XY, YX) ANOVA. List type had a highly significant effect
(F = 27.44, d.f. = 3, 33, P < 0.001). Newman Keuls tests showed that XX and YY list latencies
did not differ from each other, but each was significantly faster than either XY or YX latencies
(P < 0.01). These relations obtained for every subject. Mean XY latencies were slower by
941 msec than XX and by 930 msec than YY latencies; YX latencies were slower by 535 msec 
than XX and 524 msec than YY latencies. The results thus show that while the two conditions in 
which pairs were in the same format did not differ, each was faster than either condition in 
which standards were in different format from test letters. In addition, YX latencies were 
significantly faster by 406 msec than for XY conditions. 
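
The four differences quoted for ‘same’ responses can be checked against one another. The sketch below is only an arithmetic illustration using the values in the text (the function names are hypothetical): it derives the XY minus YX gap through each same-format baseline, and the YY minus XX gap through each mixed-format list.

```python
# Reported slowing (msec) of each mixed-format list relative to each
# same-format list, for correct 'same' responses.
SLOWER_BY = {
    ("XY", "XX"): 941,
    ("XY", "YY"): 930,
    ("YX", "XX"): 535,
    ("YX", "YY"): 524,
}


def gap_between_mixed_lists(baseline):
    """XY minus YX latency (msec), computed through a shared baseline list."""
    return SLOWER_BY[("XY", baseline)] - SLOWER_BY[("YX", baseline)]


def gap_between_baselines(mixed):
    """YY minus XX latency (msec), computed through a shared mixed-format list."""
    return SLOWER_BY[(mixed, "XX")] - SLOWER_BY[(mixed, "YY")]


xy_minus_yx = {b: gap_between_mixed_lists(b) for b in ("XX", "YY")}
yy_minus_xx = {m: gap_between_baselines(m) for m in ("XY", "YX")}
print(xy_minus_yx)
print(yy_minus_xx)
```

Both routes give an XY minus YX gap of 406 msec, matching the figure reported above, and both imply that the XX and YY means differed by only 11 msec.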

A subjects × test type ANOVA for mean correct latencies for ‘different’ responses, shown in
Fig. 3, also showed a significant effect of test type (F = 11.14, d.f. = 3, 33, P < 0.001). Newman
Keuls tests found that XX and YY latencies did not differ from each other, but each was
significantly faster than either XY or YX latencies (P < 0.01). Latencies for XY were slower by
457 msec than XX, and by 524 msec than YY latencies; YX latencies were slower by 327 msec 
than XX and 394 msec than YY latencies. The latencies for XY and YX did not differ 
significantly from each other. The pattern of results was thus very similar to that for ‘same’ 
responses. It will be noticed from Fig. 3 that here latencies for ‘different’ responses were higher 
than for ‘same’ responses. The differences were significant for XX (P < 0.05) and YY (P < 0.01),
but not for XY or YX conditions (two-tailed t tests). 
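
The degrees of freedom quoted for this analysis (3, 33) are what one would expect from a subjects × conditions design with 12 subjects and four list types. A minimal sketch of the standard calculation follows; the function name is hypothetical and is not taken from the paper.

```python
def repeated_measures_df(n_subjects, n_conditions):
    """Degrees of freedom for a one-way subjects x conditions ANOVA.

    Effect d.f. is (conditions - 1); the error term is the
    subjects x conditions interaction, (subjects - 1) * (conditions - 1).
    """
    effect = n_conditions - 1
    error = (n_subjects - 1) * (n_conditions - 1)
    return effect, error


df_effect, df_error = repeated_measures_df(n_subjects=12, n_conditions=4)
print(df_effect, df_error)
```

With 12 subjects and the four list types (XX, YY, XY, YX) this yields 3 and 33.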

Figure 3. Mean latencies (msec) for ‘same’ and ‘different’ judgements of successively presented Braille
letters in either X or Y format in conditions in which an X test letter was preceded by an X or Y standard 
(XX, YX) and a Y test letter was preceded by a Y or X standard (YY, XY). 


Discussion and conclusions 


Experiment III showed again that matching successive tactual letters in the same physical format 
is faster than matching the same letters when the standard was in different format, and 
demonstrated further that this could not be explained by familiarity effects: there was no
significant difference between judging physically identical letters in the two formats. The results 
thus support the hypothesis that tactual characteristics of letters in memory affect Braille letter 
recognition. 

The difference found for ‘same’ responses between XY and YX conditions requires some 
comment. This could not have been due to longer tactual search for Y letters, since Y-test letter 
latencies did not differ from X-letter latencies when the letter in memory was physically 
identical. However, the finding is easily explained on the assumption that Y letters took slightly 
longer to name than X letters, but could be matched as fast on shape alone. It was noticed that 
in both XY and YX ‘same’ conditions, a number of subjects responded with expressions like
‘Yes; they are the same letter. They are both “Ks” (or “Rs” etc)’. In view of the material and 
the instructions it would, of course, be surprising if naming were not involved. The main point
here is that this could not have been the only factor since matching physically identical letters 
was faster. 

The relation between ‘same’ and ‘different’ responses was not a main issue here and will be
discussed only briefly. No firm predictions were possible from a consideration of studies with
visual material. The advantage of ‘same’ over ‘different’ visual shapes (Bamber, 1969; 
Nickerson, 1965) has sometimes been explained in terms of ‘nameability’ (Bindra, Donderi & 
Nishisato, 1968). However, since similar results have been found for stimuli which are difficult to 
name (Nickerson, 1975), it seems more probable that the difference depends on ‘global’ versus 
‘serial’ processing (Bamber, 1969; Krueger, 1973). For tactual raised dot shapes, serial rather 
than ‘global’ processing may be more often necessary or useful. This might well vary between 
individuals, or with altered interstimulus intervals. For instance, speed instructions and the 
smaller interval between successive pairs in Expt. III than in Expt. II could have elicited more 
‘global’ processing for identical shapes. In simultaneous matches (Expt. I), both stimuli were 
available for rechecking in any case. However, the factors involved in possible ‘global looks’ for
tactual raised dot shapes are a separate problem which is currently being studied. 

In conclusion, the question that motivated this study was whether memory for physical tactual 
characteristics of verbal material could be utilized as a basis for matching. Findings showed that 
physical format affected not only simultaneous, but also successive letter matching by blind 
readers, and that ‘same’ responses for identical letters were faster than for successive letters 
differing in physical characteristics. These characteristics must, therefore, have been coded in memory. The
results thus support the hypothesis and suggest that Braille letter matching, like visual letter matching,
involves memory for physical (tactual) characteristics.

Book reviews


The Hierarchical Nature of Personal Illness. By G. A. Foulds. London: Academic Press. 1976. Pp. x+158.
£5.80. 


It is fortunate indeed that this book was completed before its author's untimely death. Many of the ideas and 
data it contains have appeared in a series of journal articles but without the book the underlying unity of 
concept and purpose could easily have been lost. It is more clearly and concisely written than the earlier 
work, Personality and Personal Illness, and makes an excellent job of drawing the threads together. Those 
who are unsympathetic to the author's approach may see the book as a last ditch stand in defence of 
diagnosis and the medical model. In truth, however, it is neither defensive nor orthodox. 

The concept of personal illness and its ordered classification by severity, was presented in Foulds' earlier 
book. The new one is both a clarification and a development. The idea of a hierarchy in the classification of 
psychological illness is not a new one, but the author claims this as the first ‘systematic application of the 
idea to the whole range of personal illness’. In this he is undoubtedly right and, more important, he has 
amassed some hard data in support. The hierarchy is not merely an alternative taxonomy. It has important 
implications for reliability of diagnosis, for treatment and prognosis, and is potentially more useful than a 
‘disease entity’ model. Furthermore it contains the elements of a theory of psycho-genesis from which
testable deductions can be derived. 

Much of this was stated, or at least adumbrated, in the earlier book. In the present one, the reader can 
suddenly realize what the author had been getting at all along, but there is a lot of new thinking as well. For 
instance, there is a new look at the major theme of the earlier book, the relationship between personality and 
symptoms. As symptoms remit and patients move down the hierarchy into the less severe categories of 
illness, the personality variables extra-punitiveness, intro-punitiveness and dominance remain stable. The 
evidence on this point is now much clearer than it was in 1965 and poses an important question, because, 
despite the lack of change with improvement, extra- and intro-punitiveness are related to class of personal 
illness. Scores tend to be higher in patients who are higher up the hierarchy. Either the personality variables 
are affected by illness and take much longer to change with improvement than symptoms do, or they are 
indices of pre-morbid personality and good predictors of how far up the hierarchy a patient is likely to go in
the event of illness. In either case it is important to know and the question can only be settled by a much 
longer follow-up period. ‘Should it transpire’, writes Foulds, ‘that personality attitudes are illness-affected, 
they would still provide more fundamental measures of clinical improvement than do symptom measures 
alone. Whatever may be the success rate of depth psychotherapists, they have at least aimed at the right 
target.’ Therapists of all persuasions will find much food for thought here. 

Other points about therapy are more controversial. It is concerned, states Foulds, or should be, with the 
personally ill and their incapacity to enter into mutual personal relationships. Once this capacity is restored 
with the therapist’s help, ‘the patient passes outside the range of his expertise’. This raises the very 
apposite question of how pre- and post-illness personal relationships are connected with phenomena of 
personal illness. The trouble with some therapists (he seems to have Rogerians in mind mainly) is that they 
end up seeking clients who are hardly ill at all but ‘with whom their own needs can more nearly be fulfilled’. 
They are apparently unaware of the limitations they have imposed upon themselves and one of the 
consequences is their opposition to diagnosis. ‘Naturally if one confines one’s practice to a rather 
homogeneous group of, at most, mildly ill patients, diagnosis is relatively unimportant and it becomes a 
shade less ridiculous to think of aetiology as lying entirely outside the organism; for those who deal with the 
whole range of personal illness diagnosis and multiple aetiology are inescapable.’ 

There are some worrying aspects of the book. I do not think the author regarded his arguments as entirely 
proven. Certainly there are explanations of the hierarchical results other than the one Foulds proposes, 
though to have discussed them would have detracted from the admirable economy of presentation. His 
criteria for differentiating between personality and symptom measures, namely stability and distributions of 
scores, have not always seemed entirely adequate in the past and doubts have been only partially resolved. 
The results on stability are certainly much clearer now but one still wonders about the distribution criterion. 
An unqualified statement like ‘an attitude. . .is usually normally distributed in the general population’ 
ignores the fact that it all depends on how you construct the scale of measurement. The author was 
obviously aware of this and yet continues to imply that there is something fundamentally different in the two 
kinds of phenomena that lead to their having different distributions. The point can be argued his way but it is 
not a simple argument and it should have been stated more fully. 

Points like this are relatively minor, however. The book rings true and it forms a base on which to build
rather than a finished structure. A little careful checking and perhaps reinforcing of the foundations may be 
necessary before too much superstructure is added but Foulds has an active following and it is now up to 
others to carry on the good work. It presents a viable alternative to currently fashionable thinking on 
psychological disorder and personality deviation. Its message is not pushed too hard and its claims are 
commendably modest. It deserves to be read carefully and thoughtfully and it will be an influential book. 
JACK INGHAM 


Down’s Anomaly, 2nd ed. By G. F. Smith and J. M. Berg. Edinburgh: Churchill Livingstone. 1976. Pp. 
vii+348. £12.50.


This is a very exciting book. The scholarship of the authors does not conceal the fascination of the story 
which they unfold, stranger than any science fiction. In all the recent knowledge explosion there are few 
aspects more dramatic than that of human biology, man’s knowledge of himself. Not many discoveries can 
have had such a key role as the demonstration of the additional 21 chromosome in ‘mongolism’. Down’s 
syndrome must have been with us since before the beginning of written history, yet it was not until 1866 that 
the first reliable account of the condition was published by Langdon Down. Penrose’s first edition of this 
book celebrated the anniversary of this event in 1966. The authors of the second edition point out that as 
much literature on the subject has appeared in the ten years since the first edition as in the previous century, 
so illustrating the exponential growth of our knowledge of ourselves. 

The title of Penrose’s classic contribution to scientific literature was The Biology of Mental Defect in which 
he brilliantly married the remarkable developments in contemporary biology with an understanding of human 
psychological potential. The first edition of ‘Down’s Anomaly’ may be seen as marking a major triumph of 
this movement and as signalling the end of the old metaphysics and dualism. The second edition records the 
solution of some remaining mysteries and mentions many of the problems which constitute a challenge to 
further decades. Already in his lifetime Penrose saw the vindication of his approach to the pathogenesis of 
Down’s syndrome since, in 1959, half a century after the description of non-disjunction in the evening 
primrose by Ruggles Gates, it was recognized that this was also the explanation of the occurrence of
‘mongolism’. Furthermore it was conclusively shown that the meticulous mathematical demonstration by
Penrose of the relevance of maternal age was also related to the occasional failure of separation of the pair
of 21 chromosomes in oogenesis.

With such a wealth of material in this rapidly developing field it is inevitable that some of the views 
expressed should be open to discussion. The authors mention work pointing to a general increase in the risk 
of a further child being affected by Down’s syndrome by a factor of 244. However, most of the evidence 
shows a skewing of this increased risk towards those younger mothers who have this misfortune. Any 
increased risk in the oldest mothers is therefore presumably slight and it would not be appropriate to 
multiply the risk to which they are subject by age by the same factor and for this reason they do not have 
the ‘bad’ risk of more than one in twenty mentioned by the authors. Similarly the suggestion (albeit 
tentative) that only 60 per cent of cases are age dependent is open to question. In most reported series there 
appears to be a greater incidence of regular trisomy and an absence of any particular aetiological factor. 
However, our ignorance of our own anatomy is such that we still do not have available basic information as 
to the incidence of gonadal mosaicism. 

BRIAN KIRMAN 


Obsessional States. Edited by H. R. Beech. London: Methuen. 1974. Pp. viii+352. University Paperback
Edition. 1976. £3.60. 


Obsessive-compulsive disorder is an uncommon but perennial problem which has long fascinated writers in 
psychiatry and psychology. Descriptively it was well documented by the turn of this century, and a wealth 
of literature appeared subsequently in diverse quarters. It is useful to be able to have access to writings 
through this book, which is a collection of essays on different facets of obsessional states. A problem is that 
the book is rather long in the tooth. It was published as a hardback in 1974, and two years later this 
paperback version appeared. Of more than 500 references in the index 99 per cent are from 1971 backwards. 
Omitted therefore is much of the therapeutic advance which in the last six years has transformed the outlook 
for compulsive ritualizers, i.e. treatment by exposure in vivo with response prevention, the latter possibly 
acting as an aid to prolonging exposure rather than as the necessary condition per se. Early documentation of 
progress in this area is in the chapter by Meyer, Levy & Schnurer. 

The book is subdivided into three parts. The first, on ‘Clinical and psychometric descriptions’, includes a
scholarly review by Black of the natural history of obsessional neurosis. The second part concerns ‘Theory
and experiment’. It contains, amongst others, a careful exposition by Teasdale on experimentally produced
behaviour in animals which might be relevant to clinical obsessions. Therapeutic approaches are dealt with
in the third and final part of the book. This includes the helpful report by Meyer and his colleagues, and 
Professor Cawley’s erudite exposition on the various forms of psychotherapy which are on offer. A final 
paper by Sternberg reviews physical treatment of the syndrome. For serious students this book can serve as 
a guide to the state of play in obsessive-compulsive neurosis in 1971. 

ISAAC MARKS 


Stress and Anxiety, vol. 3. By I. G. Sarason and C. D. Spielberger. New York: Wiley. 1976. Pp. xiv+365. 
£13.75. 


This book, the third in the series on stress and anxiety, is based on papers given at a conference in Oslo in
1975 on ‘Dimensions of Anxiety and Stress’ supported by NATO. There are three sections to the book which
cover the biological dimensions of stress and anxiety, situational factors in stress and anxiety, and stress,
anxiety and mental health. Each section is made up of chapters written by quite well-known authorities in
the various fields and can be considered as separate entities. The contributions range from reviews of the 
empirical literature (e.g. the chapter by Lawrence on the use of biofeedback for performance enhancement 
in stress environments) to methodological discussions (e.g. L. Laux on the multitrait-multimethod rationale 
in stress research) and to reports of investigations such as May & Sprague's study of stress as a predictor of 
family health. 

Each contribution is fairly substantial, self-contained and worthwhile in its own right and most are very
well written and illustrated. There is no visible contribution from the editors apart from the grouping of the 
15 chapters into three sections and a very short preface. In many ways this is a pity because the book is 
so diverse in its contributions that the reader feels a need for some linking material or at least some more 
extensive cross-referencing within the volume and to chapters in the three companion volumes. It is difficult 
to see who might buy this book, apart from libraries, because there can be relatively few people interested in 
the range of stress-related topics covered. Indeed there must be precious few university courses which 
attempt to cover all of this material. It is hardly likely that the mythical ‘general reader’ in psychology will 
be motivated to read such a large and technically detailed, albeit readable, volume. 

The book contains at least one example of the way in which someone who is very determined to stick 
close to the empirical evidence in his own area will unthinkingly abandon this scientific approach when 
trespassing in another area, even though this area is represented, empirically, between the same covers. 
Jeffrey Gray begins his precisely referenced and carefully backed up statement on the neuropsychology of 
anxiety by saying that alcohol is used extensively to control anxiety in man. This is one of the few 
statements in his contribution that Dr Gray does not reference. Several chapters later, G. Alan Marlatt 
writing on ‘Alcohol, stress and cognitive control’ reviews a lot of evidence relevant to Gray’s hypothesis.
He does not find any support for the idea that people drink to reduce anxiety or tension, or even that
alcohol does this incidentally in moderate doses. In fact, he suggests that alcohol may be used deliberately
to increase physiological arousal, and that any anxiety-reducing properties it may have are cognitive and not
pharmacological features of the drug.

Apart from these small inconsistencies and the lack of an editorial contribution the book must be highly
recommended as a work of reference for its pleasantly authoritative but clear expositions, its considerable 
scope and its copious references. It is an essential purchase for every departmental or university library and 
for those individuals who teach or research in stress and anxiety. 

RAYMOND COCHRANE 


Sleep, Nutrition and Mood. By A. H. Crisp and E. Stonehill. Chichester: Wiley. 1976. Pp. x+173. £6.95.


The major theme of this book, developed in the five introductory review chapters, is that sleep is associated 
with anabolic, digestive and restorative processes, and therefore with repletion and gluttony. It seems that 
fat people sleep more, and report less anxiety and depression than controls, while, in animals at any rate, 
there is a clear association between food deprivation and increasing alertness and activity. In the short term, 
nutritious drinks such as hot malted milk taken at bed-time improve the quality of sleep, especially in the 
second half of the night. This theory, or proposed relationship between sleep, nutrition and mood, is 
essentially correlational and descriptive. There is not much discussion of the physiological processes causing 
the correlation, or any grand theory of the direction of causality. People may become fat simply because they 
sleep more, and are therefore inactive and use up little energy. The authors acknowledge this, but prefer to 
believe there is more significance in the correlation than anything so trite. In this they are almost certainly 
correct, but it is a pity that they do not make their assumptions more explicit, and perhaps pursue the
implications of their theory in greater depth. 

The first of the two experimental studies, dealt with in the remaining nine chapters of the book, is an 
investigation of the sleep of anorexics, during and after treatment, and of the sleep of patients under 
treatment for obesity. They found that anorexics slept little and fitfully, but, when putting on weight, they 
slept more, and with fewer interruptions. Also, there was a change in the proportion of different stages of 
sleep (defined in terms of EEG) during remission, so that there was a massive increase in the amount of 
deep slow wave, stage IV sleep at the expense of light, stages I and II slow wave sleep. The sleep of the 
obese was much more variable, and showed no such straightforward changes with weight loss, as the 
anorexics had with weight gain. 

The second study was a large-scale survey of 375 patients referred to one out-patients’ clinic over a period 
of three years, involving patient questionnaires, consultant questionnaires and physical measures of height, 
weight, and skinfold thickness (a measure of ‘fatness’). The patient questionnaires were, in part,
retrospective, so that estimates could be made of changes of sleep patterns and weight since the onset of the 
current illness. In general, they found that psychiatric illness was almost invariably accompanied by some 
reported disorder of sleep (usually difficulty in getting off to sleep). Weight change was a common feature, 
but weight gain was almost as frequent as weight loss. There was a tendency for weight loss to be associated 
with less sleep, and weight gain to be associated with more sleep, within diagnostic categories, as the general 
theory would predict. However, probably the most interesting outcome of this study was that there was no 
discernible difference in sleep patterns between endogenous and reactive depressives, despite the fact that 
two of the four consultants involved apparently used the patients’ reported sleep habits as an important 
symptom in diagnosis, and all four were required to diagnose in terms of endogenous and reactive depression, 
rather than in terms of severity of symptom. There was also no relationship between diagnosis of 
endogenous depression and pyknic body build. 

Neither of these studies is definitive, but they are both valuable and interesting contributions. It is to be 
hoped that their collection and publication in this book does not mean that no more work is to be done in 
this area by these two authors. In particular, it seems important to establish, in the EEG laboratory, that the 
sleep disorders of anorexics can be induced in the dieting obese, if not in normals, and that these symptoms 
can be reversed by the administration of normal amounts of food. The introductory review chapters are 
comprehensive and thorough, summarizing a very diverse literature, scattered through a wide range of 
journals. They alone would make the book worth buying, both by the clinician and serious student. 

JAKE EMPSON


Hypnosis and Behavior Therapy. Edited by E. Dengrove. Illinois: Charles C. Thomas. 1976. Pp. xxi+406. 
$26.75. 


Biofeedback, Behavior Therapy and Hypnosis: Potentiating the Verbal Control of Behavior for Clinicians. 
Edited by I. Wickramasekera. Chicago: Nelson-Hall. 1976. Pp. xv+591. $18.95.


Behaviour therapists tend to be rather sniffy about hypnosis. Some, including Wolpe, use hypnotic induction 
procedures to supplement relaxation training in a minority of patients; most either ignore it or dismiss it. It is 
easy to see why this might occur. The development of clinical hypnosis has not been noted for its 
intellectual or experimental rigour and has largely escaped controlled investigation. Behaviour therapists 
frequently lay great stress on the theoretical and experimental basis of their analytic and therapeutic 
techniques and are certainly not going to rush to adopt someone else's mumbo-jumbo. These two books
seek to encourage a rapprochement between behaviour therapists and hypnotherapists, to the presumed
benefit of both. 

The volume edited by Wickramasekera (which includes a section on biofeedback — it does pop up 
everywhere doesn't it?) is curious. It is large (600 pages) and 50 per cent of the articles are reprints of papers 
by the editor. Unhappily this immodest approach fails. Dr Wickramasekera is clearly an energetic and 
creative clinician but most of these papers are preliminary and repetitive and do not justify such massive
reprinting. He also includes specially written introductory articles for each section, but they are diffuse and 
fail to unite the disparate topics he covers. Some papers not written by the editor are included; mostly 
well-known papers, some of them of interest, including papers by Seligman on preparedness and 
helplessness and Rosenhan on institutionalization. Part of the interest in these papers lies in guessing why 
they were included in this volume. Not a book to buy but perhaps worth browsing through in your favourite 
book store when the books you really want are out of stock. I particularly recommend the chapter on 
‘Aversive behavioural rehearsal’ (ABR) in which the author confides that it has been described as insane
and then adds ‘which it probably is in some respects’. 

The Dengrove volume is more modest (only two articles by Dengrove) and more successful. Using a
mixture of reprints and original articles, he systematically attempts to give an outline of behaviour therapy and
hypnosis, the relationship between hypnosis and conditioning, the application of hypnosis in behaviour 
therapy and vice versa. Inevitably the articles vary enormously, both in quality and relevance, but there is a 
reasonable number of articles of interest to behaviour therapists and in particular there are articles which 
they are unlikely to have come across. The articles by Spanos, De Moor and Barber on common features of 
behaviour therapy and hypnosis and Weitzenhoffer on the similarity of techniques used by therapists of both 
persuasions are particularly interesting and suggest that it might be fruitful to investigate the similarities of 
the processes operated in both therapies. It was also intriguing to learn that covert sensitization and 
reinforcement procedures recently introduced into behaviour therapy have been, albeit unsystematically, 
used by hypnotherapists for up to 70 years. Unfortunately, as in so many areas, it was only when behaviour
therapists became interested in the techniques that serious research was done on them, so this 70 years’
history is of little scientific value. 

In addition to these useful bridge-building articles some authors have taken the opportunity afforded by the 
book format to describe in more detail than is usually possible the procedures, measures and additional 
literature that patients are given during various treatment procedures, e.g. Rothman, Carroll & Rothman 
describe a very comprehensive programme of homework assignment and self-help aids that seem well worth 
study by behavioural clinicians even if they do not include hypnotic techniques in their practice. 

It is easy for me to believe that the addition of behavioural techniques will aid hypnotherapy. I am less 
convinced that the opposite is true. As several authors in Dengrove point out, there is little evidence on the
efficacy of the various hypnotherapeutic techniques and much of the recent experimental work on hypnosis
minimizes its uniqueness, and suggests that many aspects of these techniques are already in use in standard 
behaviour therapy. Despite this, systematic study of the interaction of hypnosis and behavioural procedures 
may improve the efficacy of behavioural procedures or increase our understanding of the processes operating, 
e.g. by increasing the vividness of aversive imagery in covert sensitization or by deepening our 
understanding of what is currently termed non-specific factors. In the meantime the hard-pressed clinician 
may be reassured to know that if your patient requests hypnosis Lazarus has shown that it costs nothing to 
add to standard treatments and may, at the 10 per cent level, produce additional benefit. Go on, chance a 
Type I error. 

DEREK JOHNSTON 


Cycles of Disadvantage. By Michael Rutter and Nicola Madge. London: Heinemann Educational. 1976. Pp. 
413. Cloth, £6.50; paper, £2.50. 


The book Cycles of Disadvantage by Rutter & Madge was commissioned by the DHSS/SSRC Joint Working 
Party on Transmitted Deprivation to be a review of the research findings on deprivation and to consider 
possible mechanisms underlying similarities between parents and their children. The interest in this area was 
stimulated in 1972 by Sir Keith Joseph, then Secretary of State for the Social Services, who wondered why, 
in spite of general economic advance, many people still live in poverty, and why the same families appear to 
use the Social Services generation after generation. The book considers many kinds of disadvantage, 
personal, social and ‘structural’ (e.g. economic), and the multiplicity of variables which influence them. This 
is a comprehensive review of the relevant literature, but does not go beyond a welter of facts into much 
discussion of the implications of the findings. Also, the style in which it is written is limiting, in that it 
subsumes a variety of ideas under jargon words like ‘intergenerational continuity’ or ‘parenting’.

Chapter 1 is an introduction to the conceptual issues and methodological considerations involved, and to 
the range of topics to be discussed. The next two chapters are concerned with economic status and housing 
and the extent to which people remain as poor and badly housed as their parents. There are some indications 
of cycles of poverty (but much more evidence for cycles of wealth, and the transmission of large fortunes 
from one generation to the next!). They point out that any cycles of housing disadvantage are most likely to 
be mediated through family circumstances such as low income or large or single-parent families or as a result 
of the particular area of residence which may have much substandard housing. The conclusions to both these 
chapters are that ‘little is known’ about intergenerational continuities but that government policy could be 
particularly effective in reducing or even eliminating the disadvantage that undoubtedly exists in these areas. 

The following five chapters are concerned with the more personal variables of intellectual performance and 
scholastic attainment, occupation, crime and delinquency, psychiatric disorder and parental behaviour. The 
chapter on intellectual performance includes a useful summary of the controversial research on genetic and 
environmental influences on intellectual development as well as pointing to the small, but important effects 
of such variables as malnutrition. 

In chapter 5, occupational status is considered in depth as it is associated with so many aspects of life. 
They point out that social mobility is high, although there is less mobility in the professional classes. Like 
wealth, high occupational status is effectively inherited. It is not possible to draw any firm conclusions from 
the data about the extent to which the chief limits on social mobility are personal, social, structural or 
political. 

Chapters 6 and 7, on ‘Crime and criminality’, and ‘Psychiatric disorder’ respectively, share the same 
uncertain conclusions. For each kind of disadvantage, certain types show continuity (e.g. persistent 
criminality, personality disorder and alcoholism) and these are associated with family problems and social 
deprivation. 

Parenting behaviour (chapter 8) shows strong continuities in some aspects. Family size and quality of 
marriage show some stability between generations but the most striking continuity is for abnormal parenting 
such as child abuse. 

Then there are two chapters on minority groups with special problems. These are ‘Multiple problem 
families’ and ‘Ethnic minorities in Britain’. The proportion of multi-problem families in Britain today is 
estimated to be at least 1 in 20, although when administratively defined in terms of long-term social 
dependence, the figure falls to less than 1 per cent. The inheritance of disadvantage is more obvious for the 
sons than the daughters of problem families. Their unsurprising conclusion on ethnic minorities is that these 
people certainly suffer from certain disadvantages, partly as the result of prejudice but little is known about 
continuities. 

The final chapter is a discussion of conceptual issues, bringing together the various forms of disadvantage 
through the variables which influence them. Methodological problems are discussed, followed by a 
description of the complexity of causal processes. The general importance of family factors is emphasised 
and so are the striking discontinuities which are evident in every area. 

‘At least half of the children born into a disadvantaged home do not repeat the pattern of disadvantage in 
the next generation.’ Perhaps a profitable research topic in the future would be to study those who break out 
of cycles of disadvantage. 

This is a very well-compiled and systematic review, and the chapters on ‘Parenting behaviour’ and 
‘Psychiatric disorder’ are particularly readable, perhaps reflecting Rutter's own interests. Within each 
chapter one is taken in rational order through the evidence and it is clear for which types of disadvantage 
there is strong evidence of ‘intergenerational continuities’ and for which the evidence is weaker. However, 
at the end of it all, it is largely left up to the reader to form his own hypotheses about the interactions 
between variables and their relative influences. As Rutter himself points out, we are greatly in need of 
research to test alternative hypotheses in this area. There is so much information and much less connecting 
it. 

This research is of political significance but little is said about this. Rutter has taken great care to be 
unbiased but has also failed to discuss the implications of the work. It is surely up to scientists to accept 
some social responsibility for policy implementations based on their findings. 

I can recommend this book only as a review of the literature and not for a discussion of the social or 
political implications of the research. 

JANET EMPSOM 


Child Guidance and Delinquency in a London Borough. By D. Gath, B. Cooper, F. Gattoni and D. Rockett. 
London: Oxford University Press. 1977. Pp. xiii+189. £7.00. 


This book is the 24th report in the Maudsley Monograph Series. It describes a retrospective investigation of 
ecological differences in the official rates of child guidance referral and delinquency in the London Borough 
of Croydon, over a five-year period, 1962-6. 

The authors' major concern is that research in child psychiatry has concerned itself, primarily, with 
factors within the family unit, neglecting for the most part, the wider social environment of the child. They 
hope their study might go some way to rectify this imbalance, and to help develop ‘an ecology of 
childhood’. 

The study is concerned with the influence of the child’s neighbourhood, school, and general practitioner on 
the official rates of child guidance referral and delinquency. Both child guidance referral rates and 
delinquency rates were shown to differ between different neighbourhoods (as defined by electoral wards and 
clusters of enumeration districts). Child guidance referral rates showed a positive association with 
delinquency rates, and both rates were related to neighbourhood social indices: type of housing, density of 
population and social class distribution were revealed as the most important. A cluster analysis indicated no 
significant effect on child guidance referral rates for distance (of the child’s home) from the child guidance 
centre, or for social class (the social class distribution of the child guidance referrals reflected that of their 
neighbourhoods, but in neighbourhoods with a high proportion of social classes IV and V, the absolute 
number of referrals was higher). As with neighbourhoods, child guidance referral and delinquency rates 
co-varied between different schools. In Croydon during the time of the study, selection still took place at 11 
years, and both rates differed between the main types of schools (primary, selective-secondary and 
non-selective secondary). Though these differences showed some association with class-size, turnover of 
teachers, and proportion of immigrant children, within each type of school, these indices were not associated 
with either rate. The influence of the school on both rates was shown to act independently of any 
neighbourhood influence. Child guidance referral rates varied among the GPs who replied to the authors’ 
questionnaire (76 per cent), but the authors failed to identify any factors (in the doctors’ background, 
training, or attitude to child psychiatry) that were associated with these differences. 

Though the title might suggest otherwise, this monograph has very little to offer the student of 
delinquency. It is already well known that delinquency rates differ between neighbourhoods and schools, but 
unfortunately the study offers little help in identifying those characteristics of a neighbourhood or school 
which cause its high/low rate of delinquency. 

This monograph is perhaps of most value to those concerned with the patterns of use of existing child 
guidance services. The reasons for the study are described on the dust cover thus, ‘As resources of both 
money and manpower are likely to remain scarce, it is important to examine the ways in which existing 
facilities are being used. Moreover, a scientific approach to prevention will require much more information 
about the environmental factors of maladjustment and about groups of children that are exposed to risk.’ 
The book has undoubtedly contributed to research in the use of existing facilities, but its failure to consider 
actual prevalence rates, as opposed to referral rates, means it has provided little real ‘information about the 
environmental factors of childhood maladjustment. . .’. 

Though the authors suggest in their preface that studies such as theirs ‘. . .may be valuable. . .in helping 
to formulate working hypotheses and to select contrasting areas for comparative field-survey’, one is left 
wondering whether, without the facilities offered by the Maudsley Monograph series, the results of such 
studies, necessary as they are, would be presented to the reader in a book costing £7.00, rather than in a 
series of journal articles. 

DAVID LUCEY 


Acoustic Phonetics: A Course of Basic Readings. Edited by D. B. Fry. Cambridge: Cambridge University 
Press. 1976. Pp. 469. £8.75. 


This is a collection of 31 papers and extracts dealing with properties of speech waves and the way they 
relate to perception. These papers are grouped into five major sections: 

Part 1: Acoustics of the Speech Mechanism (pp. 19-74: 3 readings) 

Part 2: Acoustic Analysis of Speech (pp. 75-176: 7 readings) 

Part 3: Acoustic Cues in Speech (pp. 177-352: 13 readings) 

Part 4: Investigation of Prosodic Features (pp. 353-441: 7 readings) 

Part 5: Speech Synthesis by Rule (pp. 443-466: 1 reading) 

The earliest paper is Paget's ‘Vowel resonances’ (1922) and the most recent is Lisker & Abramson's ‘The 
voicing dimension: Some experiments in comparative phonetics’ (1970). The dates of the others range from 
1940 to 1965, with the vast majority (25 readings) coming from the period 1951 to 1962. The aim is, then, not 
to give a comprehensive history of the acoustics of speech (this would call for the inclusion of a number of 
considerably earlier papers), nor to give an up-to-the-minute account of the speech research going on at the 
present moment. It is, rather, to show the lines along which acoustic phonetics has developed in the last 20 
years, and why and how these developments have taken place rather than any others: ‘The papers reprinted 
here cover much of the history of the subject and many of them have been deliberately chosen because they 
present the rationale of what has been done as well as its results’ (Preface, p. 9). 

The relation of the subject-matter of this volume to the concerns of psychologists is clearly stated by 
Professor Fry himself in his Introduction: 

‘Acoustics in its wider usage refers mainly to the study of the generation and measurement of sound 
waves, and acoustic phonetics necessarily makes use of the physical methods of acoustics for similar 
purposes, that is, in order to establish the properties of the sound stimuli which speech presents to the 
ear. To extend the inquiry to the perceptual effects of these stimuli and their relation to language 
working, it employs methods which are very largely those of experimental psychology...’ (p. 12). 

The papers of Part 3 (and some of those in Part 4) are perhaps the most relevant here. Part 3 begins with a 
short general section including Fletcher's ‘Effects of filtering and masking’ (an extract from his book Speech 
and Hearing in Communication) and Miller & Nicely’s well-known paper ‘An analysis of perceptual 
confusions among some English Consonants’. The next short section (‘Speech synthesis’) sketches the 
‘how’ of the method by which so much progress has been made in the field of speech perception. The bulk 
of Part 3 (the section entitled ‘Perceptual experiments’) consists of accounts of the application of this 
method in some of the classic experiments performed in the 1950s at the Haskins Laboratories. Part 3 
concludes with a section entitled ‘Perception and linguistic categories’. 

The observant reader will be struck by the close resemblances between this book and the Penguin Modern 
Linguistics Readings series; not only is the general scheme similar, but the detailed layout and typography 
are identical. This is no coincidence: this collection reached page-proof stage in the Penguin series before 
being dropped by that publisher. The students of linguistics and phonetics for whom it is primarily intended 
(see Preface, p. 9) might have bought the book had it appeared in paperback form: it is hardly likely that 
many of these will feel like parting with about five times as much money for this hardback version. Another 
problem arises: the hardback is in direct competition with Ilse Lehiste’s Readings in Acoustic Phonetics 
(1967), and, what is more, 10 of its 31 papers also appear in that volume (at least one in a fuller version) - 
these are Readings 3, 4, 6, 8, 10, 12, 17, 18, 24 and 31. Two more (readings 26 and 30) are also included in 
Dwight Bolinger’s Intonation reader in the Penguin series. It is a pity this overlap could not have been 
reduced; its value as a reference work for libraries would have risen correspondingly. 

One weakness of the book is the disappointing ‘Further reading’ section (p. 467): one would have hoped 
for an annotated bibliography rather than a bare list of seven titles. The history of the subject is well catered 
for by the works cited, but one would also have liked to see references to some work done since 1970. 

However, these minor disadvantages do not detract from the value of the book: Professor Fry can be 
confidently recommended as a knowledgeable and reliable guide to the present state of the art of speech 
research. 

ERIK FUDGE 


Language and Perception. By George A. Miller and Philip N. Johnson-Laird. Cambridge University Press, 
1976. Pp. xii+760. £12.50. 


If there were such a goal in the cognitive sciences as the Unified Field Theory in the physical sciences, it 
would presumably be that of relating language, thought and perception. The authors of this impressively 
weighty volume stress that their approach to part of this problem will be psychological, rather than 
philosophical or linguistic (though, if successful, their work could have immense implications for other fields 
of study). The task they set themselves is ‘to formulate a general theory relating language and perception, a 
theory adequate to support empirical and experimental investigations of scientific merit’ (p. 2). An early 
operational decision they make is to restrict themselves to words (or perhaps morphemes, despite the 
admitted difficulties of definition of either term), and ‘how people’s knowledge of words should be 
characterized’ (p. 3). 

The first two chapters are concerned with surveying the processes of perception and sensation, and 
expressing them in terms of an analytic semantics whose chief tool is the predicate calculus. Thus, 
perception of sensory attributes is represented by attentional-judgemental predicates, such as Magn (x, y), 
i.e. ‘x has the magnitude y’. That such a formula represents the result of a perceptual judgement, and not 
the process of judging itself, the authors are perfectly well aware: they do not have access to any explicit 
perceptual theory (pp. 35 sqq.), so they have no basis, except for very general notions of comparative 
simplicity or complexity of concepts, for deciding what the primitives of the system might be — a familiar 
problem to semanticists. In several places, though, the authors express the admirable, and, I believe, the 
only practicable sentiment that ‘it is better to build on sand than on nothing at all’ (p. 38): in this way, it is 
at least possible to advance hypotheses for theoretical and empirical testing — and sand which is subjected to 
enough pressure will, it is hoped, eventually compact into rock. Cognitive science, unlike any of the physical 
sciences except perhaps astronomy, is necessarily, for a large part of its coverage, a bootstraps operation, 
and it is well to recognize that research in it can be neither purely inductive nor yet purely deductive. 

On pp. 113-115, Miller & Johnson-Laird present a list of the predicates they are postulating as primitives. 
They include static attributes such as Cnvx (x), ‘x is convex’, and Red (x); processes such as Chng (x), ‘x is 
changing’, Deform (x), ‘x is changing shape’, and Rot (x), ‘x is rotating’ (all of which are presumably linked 
hierarchically in various ways); and relations such as Above (x, y), ‘x is higher than y’, Inline (x, y, z), ‘x, y, 
and z lie in a straight line’, and Simlr (x, y), ‘x is perceptually similar to y’. The last set, whose notation 
shows them to be more complex than the others (since they are several-place predicates), also reveals the 
multivalency of the argument symbols (i.e. the italic letters). Miller & Johnson-Laird recognize this to some 
extent by using different letters for some broadly distinct types of argument (e.g. e for event and t for time), 
but the lack of distinctiveness of x, y and z is a problem they share with most logicians and semanticists. 

Chapter 3 forms a bridge between the perceptual and the linguistic by reviewing the relations between 
objects and the words which name them, or more specifically, between the perceptions of objects and the 
neuromotor processes of speech. This middle ground is no less than the whole conceptual apparatus of the 
human mind, so the authors confine their scrutiny to that part underlying the lexicon of perceptual 
phenomena. They present a hypothesis towards a theory of conception based on a procedural approach: ‘we 
assume that understanding the meaning of a sentence depends on knowing how to translate it into the 
information-processing routines it calls for’ (p. 129). They are careful to point out, however, that the 
‘procedure for identifying a dog is not. . .the same as our concept of dogs’ (ibid.). Their programme is based 
on work in artificial intelligence, so for all we at present know, it may be merely an analogue of the actual 
process of identification and so on. The routines may be translated into complex propositions of predicates 
and arguments, of a structure identical with those used for the attentional-judgemental operations (which 
form substructures of the procedural routines). 

The last 500 pages of the book (chapters 4-7) concern language more than perception. Chapter 4 examines 
the procedure of labelling. In doing so, it assumes - wrongly, in my opinion - that words are labels for 
objects. Even if this is only loose talk, and ‘objects’ means ‘object percepts’, or the ‘concepts of objects’, 
(and in actual practice Miller & Johnson-Laird do not describe such a simple relationship), it seems clear that 
words occupy no special status in regard to objects, or their percepts or concepts, but that they are merely 
convenient abbreviations for the structures of meaning (whatever these consist of) into which we divide the 
conceptual universe. For this reason, we can accept as synonyms (or partial synonyms, etc.) words and 
equivalent linguistic expressions (hence the basis of most dictionary definitions). To be fair, I think that this 
is what the authors intend to postulate when they organize the concept of TABLE into a schema of 
attentional-judgemental-cum-conceptual predicates: 


TABLE (x): 
(i) THING (x) 
(ii) MOVABLE (x) & CONN (x) & RIGID (x) 
(iii) PPRT (x, WORKTOP (y)) 
(iv) PPRT (x, SUPPORT (z)) & SUPPORT (z, y) (p. 233).* 


They certainly devote the remainder of the section in which this falls to discussing the problem: they refer to 
the interrelationships between such schemata (i.e. the ‘conceptual universe’ referred to above), and also to 
the fact that neither objects nor words are meaningful unless they conjoin percepts, routines of identification 
and classification, and concepts. In logical terms, this could be translated into the requirement that a 
semantic theory should have a model theory relating it to its extensions in reality. ‘The meaning of a 
sentence is not given solely by the routine into which it can be translated in particular situations: it must also 
have a place in a larger system of knowledge and belief’ (p. 696). However, it is also at about this point that 
the authors develop, as ‘a final theoretical revision’, the notion of a lexical concept, which is eventually 
defined as ‘anything capable of being the meaning of a word’ (p. 696). By this, they mean not some Platonic 
Form, but for most words a definitional component (which is recognitional) and a connotative component 
(which comprises associational knowledge). For other words, such as deictics, whose reference constantly 
shifts, the lexical concept would be procedural. To forge a link between semantic-cognitive structures and 
perceptual-psychological correlates seems an admirable and necessary undertaking; but I would still take 
issue with the authors' decision to base their work on words. Words do not 'have' meaning: they are 
associated with it. Meanings (semantic structures) are logically prior, and it is with these that a much more 
satisfactory one-to-one link with conceptual structures may be constructed. 

The remainder of the book (amounting to over half of its length), scrutinizes specific aspects of the 
conceptual universe. Chapter 5 looks at the notions ‘property’ and ‘relation’ in respect of colour and 
kinship systems. Chapter 6 investigates 'Some fundamental concepts', namely spatial, temporal and causal 
relations. Chapter 7 closely studies some of the semantic fields of English verbs, specifically those of motion 
(historically the starting point of the whole enterprise, evidently), possession, vision and communication 
(which last set leads inevitably into illocutionary force and pragmatics). Also in chapter 7, the authors 
attempt to characterize the notion 'semantic field', in order to comprehend the intentional meaning of lexical 
items. To do this, they distinguish between lexical concepts (see above), and ‘core concepts’, i.e. a set of 
shared conditions which are necessarily or contingently fulfilled by every term in the field, ‘thought of as lay 
theories, serving to organize a person’s knowledge and beliefs’ (p. 700). 

* CONN (x) = ‘x is connected’ (i.e. is, presumably, an assemblage); PPRT (x, y) = ‘x has the proper part y’. 
Small capitals denote ‘lexical concepts’, viz. ‘anything capable of being the meaning of a word’ (p. 696) 
(see text). 

It is, of course, rather presumptuous to sum up a book of more than 700 pages in little more than the same 
number of words. As a summary of current and past research into cognition, and a programme for the 
representation of cognitive routines in a notation that can be compared with logical notations of semantic 
structure, Language and Perception is clear, stimulating and unique. As a statement, or at least an 
incorporation, of current linguistic research, particularly in generative semantics, it is, I think, less 
successful (and an important omission is the work of Jim McCawley). This is chiefly because the notion of 
structure in semantics appears to be outside the framework which Miller & Johnson-Laird have chosen 
(though it is central to the work of George Lakoff, whose major papers they cite, and McCawley, who 
receives just one mention). This makes it difficult to relate the conceptual schema of chapter 4 to the type of 
structures assumed by generative syntacticians, for instance. However, this is rather a cavilling objection to a 
book which is bound to become a classic, and if it were not for the rather steep price (which nevertheless 
works out at a mere 2p per page, comparing very favourably with other publications, particularly foreign 
ones, in these parlous times), I would recommend it unreservedly to psychologists and linguists alike. For 
libraries and those with private means, however, it cannot be too strongly urged that Language and 
Perception is an essential addition to any collection. 

PAUL WERTH 


Organizational Communication, Behavioral Perspectives. By J. W. Koehler, K. W. E. Anatol and R. L. 
Applbaum. New York: Holt, Rinehart and Winston. 1976. Pp. xii+276. (No price given on this copy) 


The authors of this book define an organization as, ‘...a structural system of relationships that coordinates 
the efforts of a group of people toward the achievement of specific objectives’. That seems to be a fine, 
broad, definition which could include not only a football team, but also, for example, reviewer, editor, 
publisher, printer, distributor and readers. The potential is there, but the book stays rather close to industrial 
and business organizations and rarely ventures to the frontier-lands suggested by the authors’ definition. 
Accepting this limitation, the authors do meet their own organizational objective of preparing a book, 
‘...that would examine the processes of human behavior and communication within the organizational 
context; one that would reflect the dynamic, all encompassing role of communication in effective and 
ineffective organizations’. 

Chapter 1 is a peculiarly redundant restatement of the preface and the book proper begins in the second 
chapter with a clear, if superficial, statement of the tenets of Scientific Management as elaborated by Taylor, 
Weber and Fayol. Scientific Management, with its atomistic ‘economic man’ controlled by hunger, fear, and 
greed, lingers on into the last quarter of this century, vide press treatment of industrial affairs. But amongst 
social scientists the approach has fallen from grace, rocked by Taylor’s study of ‘the science of shoveling’ 
(he greatly increased the efficiency of the Bethlehem Steel Company, but at the cost of 260 out of 400 jobs), 
and toppled by the implications of Elton Mayo’s classic ‘Hawthorne’ investigation. Toppled because Mayo’s 
work emphasized the importance of a factor missing from Scientific Management - ‘the human factor’. 

Mayo became the father of human relations (back in those early days everybody seems to have been the 
father of something) and the importance of communication began to emerge, first as a tool of management, 
then as an essential aspect of organizational life. Koehler et al. systematically review different types of 
communication (up, down, and horizontal) and their integration which produces the complexities of 
‘organizational climate’. Thereafter the text moves through major theories, personality types, needs, power, 
roles, leadership, and so on. The final four chapters bring it all back home in analyses of communication 
problems, decision making, and conflict resolution. 

Organizational Communication maintains a steady pace and a level of analysis well suited to a minor 
option in a psychology degree course; although it breaks no new ground the book, by taking communication 
as an integrative theme, gives a fresh aspect to a major portion of industrial psychology. The inclusion of 
real and hypothetical examples — to motivate reader interest — is the source of minor irritation since even 
some of the fictional episodes are less than apposite. But taken as a whole the book is well written and 
informative, mainly bland, never hysterical; many readers will find it a useful introduction to, or, reminder 
of, an important area of applied psychology. In this book I met, for the first time, the ‘Pollyanna-Nietzsche 
effect’ which is a state of mind combining optimism and the feeling of invulnerability sometimes shared by 
members of a decision-making group. It is a pleasingly named concept which carries well to other areas and 
suggests that industrial psychology, as a discipline, is advancing nicely from the days of ‘Speedy’ Taylor. 
Practitioners must feel some frustration at the failure of industry and media to keep up with them. In the 
meantime I nominate E. E. Carter as the father of the ‘Pollyanna-Nietzsche effect’. 

RAY BROWN 


Task and Organization. Edited by Eric J. Miller. London: Wiley. 1976. Pp. xviii+379. £7.75. 


This collection of essays on and contributions to the ‘open-systems’ approach to organizational behaviour is 
the second volume in the Wiley series on individuals, groups and organizations. In all 23 authors contribute 
to 17 substantive chapters, which themselves are arranged in three sections: ‘Organization and the 
individual’, ‘Organization for tasks’, and ‘Approaching organizational change’. 

A key figure in the development of the open-systems approach was A. K. Rice, who died in 1969. This 
collection is in large measure a tribute to his work and ideas, and in a broader sense is a tribute and addition 
to the work and inspiration of the Tavistock Institute. Rice’s last paper is included in the volume, though it 
has previously been published. 

As befits, rightly or wrongly, the task of the book it cannot be seen as a critical work. Indeed in the 
introduction Miller adopts a defensive stance as if to neutralize in advance any criticism of the open-systems 
approach. He suggests that such criticisms would be difficult to evaluate since we tend not to recognize the 
different vantage points from which such criticisms are made (e.g. psychology and sociology). But surely a 
common criticism of the types of organization theorist subsumed under ‘open-systems’ is that it is they who 
confuse the points of view — not the critics - as when conceptually incoherent and empirically unsupported 
shifts are made from structural to psychological levels of explanation, or when problems of understanding 
organizational behaviour get defined as managerial problems of recognizing and controlling systems 
boundaries. Such emphases, common throughout the book, are surely myopic when one considers the 
perspectives of other interested parties: workers, academics, researchers, government agencies, etc. 

Miller’s attempt to brush aside criticism by appealing to a multirole perspective on organization theory is 
further reflected in Tom Lupton’s contribution where he adopts an almost anti-theoretical stance by arguing 
that each organization is unique — much as psychologists shield themselves from accusations of failure by 
pointing out that each person is unique. 

An interesting aside is the raising of the spectre of Marx by Robert Kahn in his chapter. Earlier in the 
book Miller would seem to want to assign responsibility for the ‘socio-technical concept’ to Rice himself. 
But the concept of an interaction between productive technology and the modes of social relations must 
surely be laid at Marx’s doorstep. 

Leaving aside the many criticisms which can be made from the academic vantage point, this collection 
should certainly appeal to, and contain much that is valuable for, organization analysts and developers. 
The organizational settings include industry, military, church, university, prison and hospital (nine of the 
contributors are listed as working in clinical settings). 

The various contributors show a traditional regard for the human consequences of organizational action, 
and one of the dominant themes is how to improve the lot of the less powerful. However, industrial and 
social psychologists may be disappointed by a lack of closely observed studies of human behaviour. All too 
often the facts of organizational life for the individual are taken for granted by committed and enthusiastic 
analysts. 

Of course, behaviour within organizations cannot usefully be viewed independently of the behaviour of the 
organization. The contributors try to forge this link with the concept of the organization’s primary task, but 
ultimately one may be inclined to ask if the concept of primary task does not disguise many of the complex 
problems of organizations and for at least some serve to allow conflicting data and perspectives to be 
swallowed in one. 

JOHN LOCKWOOD 


Human Reliability in Quality Control. Edited by C. G. Drury and J. G. Fox. London: Taylor & Francis. 1975. 
Pp. xii+315. £7.00. 


People make mistakes - hence the need for industrial quality control; but the inspectors make mistakes too, 
and these are the subject matter of this book. It arises from a conference held in New York in 1974, and 
consists of papers by English and American psychologists, ergonomists and engineers. The approach is 
optimistic; indeed, it seems at first that the intended reader (an engineer, one gathers) is to be presented with 
a comprehensive and, more important, immediately applicable body of human error data and relationships. 
Error ergonomics has not, however, attained such completeness and this becomes apparent; the book has, as 
it were, a common subject matter without having a common theme. 

Nearly 20 papers are grouped into three parts in order of decreasing theoretical content, with brief 
introductory and summary material by the editors. Part I reviews, somewhat uncritically, visual search, 
decision making and vigilance models as candidates for a theoretical foundation for inspection performance, 
which each can describe in part. Signal detection theory finds the most application in the later papers; 
perhaps this is why it is introduced afresh three times. Two of the papers here are primarily reports of 
experimental work and the space might more appropriately have been devoted to other theoretical 
approaches, for example perceptual skills or feedback in training, which would have led the way to some of 
the succeeding material. 

The second part discusses factors affecting inspection performance. In a candid review of individual and 
group differences (largely unrelated to inspector performance) Wiener concludes that training, motivation and 
job design offer potentially stronger angles of attack, but unfortunately there is little mention of these 
elsewhere. It is symptomatic that the book’s rather superficial subject index lists jam tarts but does not refer 
to learning or task analysis. The other contributions deal with specific variables, with interesting practical 
advice on lighting from two workers at Kodak and an extensive review of the effect of movement in 
conveyor-belt inspection. Training is the subject of two papers but both are restricted to a few parameters of 
KR. 

The final part, entitled ‘Industrial applications’, illustrates the application of skill and experience rather 
than of theory. Rigby & Swain rightly stress the importance of accident and incident records and, all too 
often neglected, subsequent analysis, and they offer a delightful anecdote about a selection problem in a 
lemonade factory, the solution of which owed nothing to psychology and a great deal to an intuitive 
appreciation of human motivation. These authors are clearly immersed in a wide range of practical problems 
and adopt a practical and, dare one say it, common-sense approach for which their psychology is none the 
worse. 

Most of the remainder are sound applications of experimental psychology. As the editors remark, we can 
measure performance in particular circumstances and describe qualitatively the effects of variables, indeed 
the book amply illustrates this, but this is all we can do; in industrial application it is perhaps enough. It 
seems that theory is of more benefit when used as a guide in the solution of a problem than when extended 
towards a point prediction of the effect of a single variable; correspondingly, the more satisfactory parts of 
this book are the more general ones. 

B. N. M. DUTTON 


Managing People at Work. By O. Jeff Harris, Jr. Chichester: Wiley. 1976. Pp. xxi+581. £9.20. 


The initial objective of providing a basic reference work has in my opinion been fulfilled. The book takes 
basic but rigid steps through concepts of management providing cases along the way. The work is well 
documented and is liberally illustrated with diagrams and models of excellent clarity. The contributory chapters 
deal in detail with explanations of behaviour of people at work within working organizations, giving 
eminently suitable case studies at the end of each chapter. It is unlikely that the book could be used on 
specific academic courses except those of total technology, i.e. it is considered that the book would be 
useful for engineers doing higher level courses to obtain managerial/supervisory expertise. As a basic 
reference work, the book could act as an anchor on most other management topics, but it would be 
necessary to use a considerable amount of back-up material from other authors to explore matters important 
to management to a large extent. In this context the book is considered extremely valuable from the point of 
view of management studies. 

Chapter 18 is considered particularly weak in UK terms as the management of conflict resides much closer 
to the point of production or service, i.e. at first levels of supervision. On p. 368 the managerial grid (Blake 
& Mouton), whilst adequate in theory, does not allow for the effect of extreme environmental conditions, 
e.g. higher levels of unemployment in all sectors. The pressures created by such external conditions are not 
mentioned, although they are referred to in the Model of Worker Behaviour and its determinants F.24.1 
p. 500/501. The behaviour model is useful for the ab-initio management student giving a good basic insight 
into the build-up towards actions and behaviour. Referring again to Blake & Mouton’s grid — the 1-9 
Manager/Supervisor rating may be the best under certain environmental conditions. Economic change affects 
capacities for management and it is vital that management understand this and are aware of the power 
structures which can be modified and controlled with economic environmental condition change. The book 
does not give sufficient space to this most important aspect of managing people. 

This is a good basic management text; sufficient references are given to produce back-up material for the 
special research areas in management. The models, diagrams, etc., are extremely well presented for basic 
reference students and readers. 

BRIAN J. LAMB 


The Psychology of Memory. By Alan D. Baddeley. London: Harper & Row. 1976. Pp. xvii+430. Cloth, 
£10.45; paper, £5.45. 


This soundly informative and much needed textbook takes, as its field, a body of research which has 
burgeoned in the past 20 years and to which Baddeley has himself contributed. In this field, human memory 
is viewed as a computer-like, information-processing system, and the research task is to explore 
this system by formulating hard-nosed conjectures about its functional design and attempting to test these in 
the experimental, psychological laboratory. The field is difficult for an outsider to penetrate because it is still 
proliferating, its conjectures are mostly at fairly high levels of abstraction, its experiments are usually 
technically sophisticated, and its literature is often couched in language that borrows heavily from computer 
science. In his book Baddeley attempts the formidable task of surveying the contemporary state of this field 
at a level suitable for advanced undergraduates. All in all, he has succeeded impressively well. His book is 
careful, meaty, balanced, and lists about 800 references, mostly of recent date. His text is a clear leader in 
its field, and likely to be for some time. General readers should, however, be warned that the book is 
explicitly not concerned with everyday questions about memory in educational, medical, legal, or 
cross-cultural contexts. The focus is on work conducted in the experimental-psychological laboratory to test 
fairly abstractly formulated notions about the memory of ‘standard’ adult humans. In keeping with this 
focus, there is only a little about physiology, animals, and the effects of brain injury; and nothing about 
ageing or about children's development of memorizing strategies. 

The book starts by considering memory as an overall system. The first chapter outlines the work of 
Ebbinghaus to show the need to simplify questions in order to make them experimentally tractable, and then 
the work of Bartlett to caution that inappropriate simplification is self-defeating because it excludes what is 
most characteristic and interesting about memory. The next four chapters consider, respectively: input 
limitations of the system; storage and consolidation of memory traces; loss of information from memory; 
and various attempts to explain all forgetting in terms of interference among associations. Baddeley then 
turns to work aimed at breaking down memory into subsystems and at understanding these by using an 
information-processing approach. A chapter entitled ‘How many kinds of memory?’ is the first of three 
devoted to short-term memory and to the evidence for and against a distinction between long- and short-term 
memory. The next two chapters give a welcome survey of recent work on visual, auditory, kinaesthetic, 
tactile, and olfactory memory. Then come three chapters on the role of meaning in memory: one on the 
organization which shows up when people attempt to recall lists of unrelated words; one on memory for 
prose; and one on semantic memory. The penultimate chapter turns to people with exceptionally good 
memory, and then considers mnemonics in terms of what is known about normal memory. The final chapter 
asks ‘Where do we go from here?’. Taken as a whole, the book does not arrive at any dramatic new 
theoretical synthesis, but it brings together in one place an enormous amount of intricate theory and 
experiment, and does so with sustained clarity, scholarship, and fair mindedness. 

In its field, this text will play an important role in conserving, disseminating, and furthering knowledge. 
Apart from good service in advanced undergraduate courses, it will give researchers a helpful and 
encouraging background reference. Some researchers have recently bemoaned lack of cumulative progress in 
this field. The text shows that much of the work has, in truth, been piecemeal, trial-and-error and, to use the 
current phrase, phenomenon-driven; but also that, despite ups and downs and blind alleys, progress has 
unquestionably been made. Finally, the text will give reliable information about the field to psychologists and 
others who seek a broad overview of the many-faced attempts to understand man's nature. The objectives 
of the field are often misunderstood, and the field itself is sometimes dismissed on the assumption that it is 
tilled by clever children who know more about computers than about human beings and have a perverse 
passion for gadgetry, superficial ideas, and obscurantism. Baddeley's text will soften this assumption. A 
century has not yet passed since Ebbinghaus launched the then-novel enterprise of devising special 
laboratory tasks in order to answer, under controlled conditions, certain questions about memory. Many 
questions still elude scrutiny by such an enterprise, and perhaps always will, but Baddeley's survey shows 
that the enterprise has now acquired signs of making a lasting, and distinctively twentieth century, 
contribution to human knowledge about human capabilities. 

IAN M. L. HUNTER 


Neural Mechanisms of Learning and Memory. Edited by M. R. Rosenzweig and E. L. Bennett. Cambridge, 
Mass., and London, England: MIT Press. 1976. Pp. xiii+637. £20.10. 


This massive tome includes some fascinating and useful material, but falls far short of its intentions and 
indeed arguably could never have realized them in the current state of the neurosciences and their relation to 
psychology. Author after author admits in conclusion that what he has been discussing tells us nothing of 
the neural mechanisms of learning and memory. As the neurosciences advance, the problem of the physical 
basis of learning can be posed from better background information and with greater technical precision, but 
the number of probable and relevant answers remains as few as ever. 

The book might most favourably be regarded as a close relative of the Neurosciences Study Program 
series, although without the crispness and more generally high critical standard of those symposia. 
Synaptogenesis, neuronal cell cultures, and invertebrate neurophysiology are among the topics addressed. 
Information processing, ethological and animal behaviour process approaches to learning are each 
represented. Neural modeling and memory is presented with unusual clarity and brevity. Attempts to identify 
and to interpret effects of drugs, electrical stimulation, and experimental and clinical brain damage on 
memory are understandably if not necessarily instructively included, as are also the anatomical and 
biochemical effects of differential experience or training tasks. 

Many topics are treated in a largely theoretical review coupled with a reply. This can be a useful and 
interesting format. In this case, however, all too often the reply is rejecting the orientation of the review as 
conceptually inappropriate, which suggests that one or other author or the whole field is a good way from 
clear empirical advance. Other weaknesses include too much fuzzy formulation and the introduction of data 
or arguments which are almost certainly not germane to the specific theme of learning and memory, 
interconnected with everything else though that is. There is heavy citation of secondary sources and serious 
imbalance. An author whose writing I should know as well as anyone is cited incorrectly half the time and 
gratuitously on the other occasions. The electrophysiology of mammalian learning and recall — probably the 
only area which is currently yielding any results directly bearing on the problem — is represented only by the 
occasional one- or two-column summary hidden in other material. An account is offered of the prospects for 
applications of research on neural mechanisms of learning and memory, no doubt because the 1974 
conference on which the book is based was sponsored by the US DHEW National Institute of Education. 
Frankly, the justifications are specious. They include a false analogy with the usefulness that germ theory 
proved to have, and a non sequitur from ‘neural immaturity’ to ‘infantile amnesia’. The effects of drugs in 
hyperkinetic children might be a connexion between neuroscience and education, albeit a controversial one, 
but the phenomena did not depend on research into the neural bases of learning and memory. The 
eccentricities extend to a two-page joke proof that there really is a ‘little man in the head’, although not quite 
of the sort that theorists even these days have accused others of invoking. Hopefully, Rozin will not mind 
even that effort being characterized as slightly flat-footed. 

So what do you do with a curate’s egg? If you are short on some specific, you can search the contents to 
see if it is there and digestible. Otherwise, treat it as a sample from at least that sort of grocer’s shop. 

D. A. BOOTH 


Introduction to Multivariate Techniques for Social and Behavioural Sciences. By S. Bennett and D. Bowers. 
London: Macmillan. 1976. Pp. xii+156. £10.00. 


This book aims to provide a multivariate equivalent to the many currently available texts covering univariate 
statistics at an ‘elementary’ level, by describing various topics in multivariate analysis with almost no use of 
matrix algebra. Such an approach will, no doubt, appeal to some people, but it does have distinct 
disadvantages and certainly causes difficulties at several points in this text (for example, in chapter 5). 

A major part of this work (essentially five out of the nine chapters), is devoted to various aspects of factor 
analysis, beginning in chapter 2 with a description of the long defunct centroid method. The rotation of 
factors is covered fairly well in chapter 3, and principal factor analysis and principal component analysis 
appear in chapter 4. Other methods of factor analysis are also covered in chapter 4, although only very 
briefly. Devotees of maximum likelihood factor analysis will note with regret that the only mention of this 
technique is by a reference to Lawley's 1940 paper — some fairly important ‘later’ developments due to 
Jöreskog and others appear to have been overlooked! 

Chapter 5, entitled ‘Multiple groups analysis’ is again predominantly concerned with factor analysis 
although a section on McQuitty's linkage analysis is included, as the book's only example of a clustering 
technique. Indeed, not even any references to other clustering methods are given, although the authors do 
concede in a footnote that other more complex (and perhaps preferable) methods of cluster analysis exist! 

Chapter 6 is a very disappointing account of multidimensional scaling. Fifteen to twenty years ago it may 
have been reasonable and fairly up to date. Now, however, it appears irrelevant. The only reference to 
‘current’ work is to Kruskal's 1964 paper and this is essentially only in passing. 

The remaining three chapters include a well-written and useful account of discriminant analysis, and a 
chapter concerned with the analysis of qualitative data, using repertory grids, latent structure analysis, etc. 

The final chapter presents an overview of multivariate analysis which may be quite helpful to many 
investigators. 

To provide an ‘elementary’ text on multivariate techniques is almost a contradiction in terms because of 
the complexities in general of multivariate data. These authors have, however, in many respects, made a 
fairly brave attempt at this very difficult task. The main criticism of their work is that as well as equating 
elementary with ‘no matrix algebra’ (which may be acceptable), they appear, at least in some areas, to have 
equated it with ‘obsolete’, and this makes much of this text seem rather old fashioned. A further criticism 
which can be made, not now of the authors, but of the publishers of this book, is the wholly unrealistic price 
applied to it — inflation may be rife but surely not this rife! 

B. S. EVERITT 


Statistics and Experimental Design for Behavioral and Biological Researchers. By Victor H. Denenberg. 
Washington and London: Hemisphere Publishing. 1976. Pp. viii+344. £13.25. 


There have been numerous textbooks offering an introduction to statistics for non-mathematicians in the 
behavioural and social sciences but only a few have been sufficiently successful to become standard texts 
running into many editions. This book is unlikely to become one of the few not least because of the strange 
reasons given by the author for his belief that it will succeed where predecessors have failed. 

He argues that previous texts have been too broad and have included methods which he has never used in 
more than 20 years of research. He instances percentiles and, with staggering egocentricity, argues that all 
methods which he does not use should be eliminated from textbooks for those contemplating research using 
animal preparations, an argument which depends on the naive statistical assumptions that he is a 
representative sample of n = 1 from this population, that statistical methods have remained static for more 
than 20 years and that the next generation of psychologists in this field will continue using only those 
methods he has used and will research only in those specific areas which have interested him. 

Thus the book rests on the simple faith that, if only his tutors had known exactly what he would do in 
later life, they could have restricted his course to the minimum cookbook exposition of those methods he 
would use. This is tantamount to arguing that children who will become boxing referees should be taught to 
count only to ten and those destined to be cricket umpires to count only to six. With this strange argument 
in mind he looked over his old college notes so as to exclude anything he was taught but has not used. In fact 
this turns out to be very little since the book covers most of what competing texts cover: chapters on 
measurement, on simple descriptive statistics, on sampling and estimation, on tests of significance and 
simple analysis of variance, on simple regression and correlation and finally, as though an afterthought rather 
than the central techniques in psychological research which they have become, two chapters on chi-square 
and other non-parametric methods involving ordinal and nominal data. 

Despite the fact that his emphasis on utility effectively precludes it, he then asserts that his emphasis will 
be on the logic of statistics rather than on cookbook procedures. In fact, what he calls statistical logic turns 
out to be nothing more than unsupported arguments for dispensing with statistical logic and substituting 
laboratory myths created to justify rituals invented by people whose acquaintance with statistical methods 
has not risen above cookbook level. Indeed, he argues explicitly in relation to sampling and implicitly in 
other areas that ‘the procedures demanded by the theoretical statistician. . .and the practical concerns of the 
day-to-day experimentalist are simply incompatible in most instances' (p. 43). If true in his area of research 
this would seem an excellent reason for abandoning statistical methods in favour of other, more suitable, 
methods but it seems a most peculiar reason for writing an introductory text advocating their use despite this 
limitation. 

The psychologic he uses to escape the dilemma is to argue that, if the statistical rules cannot be followed, 
the results can be interpreted as though they had been. Thus, arguing that random samples are difficult or 
impossible to come by in animal research, he asserts that results from non-random samples can be accepted 
without qualification if they confirm or extend a set of established findings and can be accepted if a 
repetition of the experiment gives similar results in cases where new phenomena are uncovered or the results 
conflict with established findings. A better strategy for perpetuating existing theoretical biases and prejudices 
would be hard to come by. 


Nowhere does he discuss the actual logic of statistical methods and hence gives no indication that 
statistics is a controversial and developing discipline; that, for example, many statisticians have little or no 
confidence in confidence intervals, regard the logic of tests of significance as fallacious and prefer Bayesian to 
classical statistical strategies and so on. 

Perhaps there are those, outside the psychology section of the flat-earth society, who share the author’s 
psychologic and to whom this book could be recommended. But for those who accept that statistics is a 
controversial and developing discipline which can aid, but not replace, the use of intelligence in experimental 
design and data analysis, the obvious strategy is to rely on existing texts. For those interested in research 
involving animal preparations, however, the book does provide a useful source of relevant statistical 
examples. 

A. B. ROYSE 


Quantitative subjective assessments are almost always biased, sometimes 
completely misleading* 


E. C. Poulton 





Subjective assessments are used widely in civilized societies for evaluating people, the work they do, and 
the stresses they work under. Yet most of the people who use subjective assessments are probably not 
aware of the pitfalls which they involve. 

Range effects introduced by the investigator almost inevitably bias quantitative subjective assessments, 
unless very special precautions are taken to prevent or eliminate the bias. Stevens' method of direct 
magnitude estimation is always biased by range effects. The international standard for calculating the 
loudness of noise corresponds to a 30 decibel range of sound intensity. 

When the investigator takes special precautions not to introduce bias, the observers may use their own 
well-known standards which they bring with them to the investigation. In a height constancy investigation 
conducted in a field, the well-known standard is the height of a hedge or fence. In judging just acceptable 
noise levels, the well-known standards used by the observers must depend upon the noise levels which they 
are accustomed to. 

The chief danger of subjective assessments is that they may be based upon a well-known rule which does 
not happen to apply in the particular circumstance of the investigation. A result may then be obtained which 
is the direct opposite of the truth. Examples are the well-known rules that noise and heat interfere with 
work. Yet performance can improve in both noise and heat. Results of this kind are revealed by measuring 
performance, not by asking people how noise or heat affects them. Subjective assessments should be used to 
complement measures of performance, not to replace them. 





Subjective assessments are used extensively in civilized societies as part of the technique of 
management. They are used to evaluate people's performance in the Civil Service, in industry, 
and even the performance of children at school. Scientists use subjective assessments to indicate 
the value of the work done by other scientists. They assess subjectively the merits of other 
scientists’ applications for research grants. Subjective assessments are used also in social and 
clinical psychology research. 

In industry, attitude measurement has long been used to quantify the relationship between 
labour and management. The more recent work measurement also includes subjective 
assessments, for example of work complexity. The author was first introduced to subjective 
assessments over 20 years ago at Harvard, when working with the late S. S. Stevens on methods 
of measuring the subjective effects of noise. Subjective assessments are used also to define 
comfort zones for the temperature of offices and factories (ASHRAE, 1972), and to specify the 
limits of acceptability for glare, and flickering light (Hopkinson & Collins, 1970), and for 
vibration (International Organization for Standardization, 1972). 

One suspects that investigators now sometimes use subjective assessments instead of 
measuring performance simply because subjective assessments are quicker and easier, and so 
cheaper, to obtain. It has been suggested that the initial choice between designs for a new piece 
of industrial equipment should be based upon the preferences of a number of experts or possible 
users, instead of upon measures of efficiency (Knowles, 1967; Knowles et al. 1969). The basic 
assumption is that by and large people know. All you have to do is ask them. 

How adequate are quantitative subjective assessments? The short answer is that no method of 
measurement is perfect when difficult judgements have to be made. But most people who use 
quantitative subjective assessments probably have no idea of the pitfalls. Objective measures of 
performance are far preferable whenever they are available. 

* The Tenth Annual C. S. Myers lecture given on 5 April 1976 at the University of York. Two sentences 
of the original version have been deleted by the Editor. 

This lecture has prompted a reply from Dr D. E. Broadbent (see pp. 427-429), and further comments from 
Dr Poulton may be found in a later issue of the Journal. Meanwhile, readers may wish to consult 
Psychological Bulletin, 1977/8, where it is expected that some of the issues will be discussed by both 
authors. 


Stimulus and response range effects 
Quantitative subjective assessments are particularly susceptible to range effects. Range effects 
are produced by transfer of training when each person is given a number of different 
experimental conditions. Range effects produce more bias than most psychologists like to admit 
in experiments measuring performance (Poulton, 1973, 1974, 1975). They are particularly marked 
in quantitative subjective assessments. 

Stimulus range effects are produced by the range of stimuli selected by the investigator. Once 
the observer has learnt the range, he tends to judge each stimulus partly by its position in the 
range. This is illustrated on the left side of Fig. 1. 


[Figure 1 panels: stimulus range effect (left) and response range effect (right), each mapping stimuli onto the rating scale.] 





Figure 1. Model for stimulus and response range effects in rating noisiness. 


Responses are selected from a six-category rating scale, with the two extreme categories not 
labelled. The range of noises on the extreme left is pretty intense, 100-80 dB. But once the 
observer has learnt the range, he tends to put the 90 dB noise from the middle of the range in 
the middle of the rating scale. This is indicated by the dashed line and the arrows on the left, 
which slope downward. 

The range of noises to the right of the rating scale is less intense, 90-70 dB. Here again the 
observer tends to put the noise from the middle of the range, this time 80 dB, in the middle of 
the rating scale. So the 80 dB noise from the less intense range of noises tends to receive the 
same rating as the 90 dB noise from the more intense range of noises. In practice the observer's 
judgement is a compromise. It is determined partly by the actual loudness of the noise, and 
partly by the position of the noise within the range of noises. 
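
The compromise described above can be sketched as a weighted average. This is only an illustrative model, not one given in the lecture: the linear form, the weight `weight_on_range`, and the fixed anchors of 70 and 100 dB for the "absolute" scale are all hypothetical assumptions chosen to match the two ranges of Fig. 1.

```python
def predicted_rating(level_db, range_low_db, range_high_db, weight_on_range,
                     abs_low_db=70.0, abs_high_db=100.0, n_categories=6):
    """Predicted position on a 1..n_categories rating scale.

    Hypothetical model: a weighted compromise between the position of the
    noise within the presented range and its position on a fixed scale.
    """
    # Position of the noise within the range presented to this observer (0..1).
    relative = (level_db - range_low_db) / (range_high_db - range_low_db)
    # Position on a fixed scale that ignores the presented range (assumed anchors).
    absolute = (level_db - abs_low_db) / (abs_high_db - abs_low_db)
    position = weight_on_range * relative + (1 - weight_on_range) * absolute
    return 1 + position * (n_categories - 1)


# Pure stimulus range effect: the mid-range noise of each set gets the same rating.
print(predicted_rating(90, 80, 100, weight_on_range=1.0))  # 3.5
print(predicted_rating(80, 70, 90, weight_on_range=1.0))   # 3.5

# Compromise: the two ratings differ, but by less than a fixed scale would give.
gap_compromise = (predicted_rating(90, 80, 100, 0.5)
                  - predicted_rating(80, 70, 90, 0.5))
gap_fixed_scale = (predicted_rating(90, 80, 100, 0.0)
                   - predicted_rating(80, 70, 90, 0.0))
print(gap_compromise < gap_fixed_scale)  # True
```

With a weight of 1 the 90 dB and 80 dB noises receive identical ratings; with any intermediate weight the ratings are pulled together without coinciding, which is the pattern the text describes.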

Response range effects are produced by using a limited range of responses with an obvious 
middle. This is illustrated on the right of Fig. 1. The observer tends to select a rating closer than 
he should do to the middle rating, whatever the intensity of the noise which he is judging. Thus 
the range of ratings used is smaller than it should be. Response range effects are found in pure 
form only for very first judgements. Judgements after the first are affected also by the range of 
previous stimuli. 

Both stimulus and response range effects can be said to represent the central tendency of 
judgement (Hollingworth, 1910). In stimulus range effects the range of responses is centred upon 
the range of stimuli. Response range effects are easiest to remember in terms of caution. The 
observer plays safe, and selects a response a little too close to the middle of the range of 
responses. 


A stimulus range effect in rating noisiness 


Figure 2 illustrates some beautiful data published by the Acoustics Section of the National 
Physical Laboratory. The filled points represent motor vehicle noise, taken from Robinson, 
Copeland & Rennie (1961). The unfilled points represent aircraft noise, taken from Bowsher, 
Johnson & Robinson (1966). The vertical lines show the range of noises in decibels. The short 
horizontal lines show the mid-points of the ranges. The vertical lines have been separated 
horizontally to make all the mid-points of the ranges lie on the same straight line, indicated by 
the dashes. 


Sound pressure level in decibels (A) 
Figure 2. A stimulus range effect in rating the noisiness of road vehicles (filled points) and of aircraft 
(unfilled points). Results from Andrews & Finch (1951), Bowsher et al. (1966), Weber & Lauber of the 
Generaldirektion PTT (1959), and from Robinson et al. (1961). 


The diamonds show the mid-points of the straight lines fitted to the ratings. In the National 
Physical Laboratory data this is the transition between ‘acceptable’ and ‘noisy’, or between 
‘moderate’ and ‘noisy’. If the observers behaved like sound level meters and had a fixed 
criterion for the transition between ‘moderate’ and ‘noisy’, all the diamonds would lie on a 
horizontal line. If the ratings were determined entirely by the ranges of sounds presented, the 
diamonds would be superimposed upon the short horizontal lines. The dotted line fitted to the 
diamonds by eye is a compromise. The mid-points of the ratings are determined partly by 
the actual sound levels, and partly by the ranges. 

In the Andrews & Finch (1951) investigation on the left of the figure, the diamond representing 
the mid-point of their ten-point scale of ‘objectionableness’ is too high. It is pulled up by the 
middle of the range of noise intensities. In the investigations on the right of the figure the 
diamonds are too low. They are pulled down by the middle of the range of noise intensities. The 
two lines cross at about 85 dB. Here a diamond would be pulled neither up nor down. The 
crossover represents the only unbiased point in all the investigations. 

The filled diamond near the top on the left, from the American investigation by Andrews & 
Finch (1951), lies at 90 dB. It is 17 dB above the filled diamond near the middle on the right from 
the Swiss investigation by Weber & Lauber (Generaldirektion PTT, 1959). Both investigations 
are concerned with motor vehicle noise. In physical units, a difference of 17 dB represents a 
sevenfold increase in the amplitude of the noise. It represents a 50-fold increase in the noise 
power. In terms of loudness, it represents at least a fourfold increase. You would hardly think 
that different investigators could find such enormous differences in the just acceptable noise 
level. 
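
The physical ratios quoted for a 17 dB difference follow from the standard definition of the decibel. A minimal sketch (the function names are illustrative, not from the lecture):

```python
def amplitude_ratio(difference_db):
    """Ratio of sound pressures (amplitudes) for a level difference in dB."""
    return 10 ** (difference_db / 20)


def power_ratio(difference_db):
    """Ratio of sound powers for a level difference in dB."""
    return 10 ** (difference_db / 10)


difference_db = 17  # gap between the two diamonds discussed in the text
print(round(amplitude_ratio(difference_db), 1))  # 7.1
print(round(power_ratio(difference_db), 1))      # 50.1
```

The amplitude ratio is the square root of the power ratio, which is why the same 17 dB corresponds to roughly sevenfold in amplitude and roughly 50-fold in power.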

When discrepancies like these arise, physicists and engineers like to attribute them to minor 
differences in the physical parameters of the investigations. The noises used in the three 
investigations have different sound spectra. The noises are heard directly in the open air, or they 
are recorded on magnetic tape and presented in the laboratory by loudspeaker after 
amplification. The rating scales use slightly different verbal descriptions. Clearly minor 
differences in the physical parameters of the investigations must affect the results. But they do 
not produce discrepancies of this order of magnitude. 

The investigation by Bowsher et al. (1966) at the 1964 Farnborough Air Show is as free from 
changes in the physical parameters as it is possible to be in a field investigation. The group of 
observers represented by the middle function in Fig. 2 judged the noisiness of the aircraft from 
an assembly hall which was fairly close to the landing end of the runway, and 100 metres from 
the glidepath. The group of observers represented by the right-hand function made similar 
judgements at the same time from a church hall which was twice as far from the landing end of 
the runway, and 900 metres from the glidepath. 

The results from these two new groups judging aircraft noise are represented by the unfilled 
points. They fit in neatly with the results of the groups judging motor vehicle noise. The heights 
of the diamonds are clearly not due to minor differences in the physical parameters of the 
investigations. They are due to the different ranges of noise intensities presented. Parducci 
(1963) did much of the fundamental laboratory work on range effects in ratings. 

Biases similar to those illustrated in Fig. 2 occur in assessing the quality of research proposals, 
and in assessing the quality of essay-type examination answers. Here the diamonds might 
represent the pass mark. The Andrews investigation on the left might represent a good year. So 
rather over half are passed. The Weber investigation might represent a bad year. So less than 
half are passed. Yet there is no overlap between the two distributions. Any one of the Andrews 
year research proposals or examination candidates would have been passed in the poor Weber 
year. None of the Weber year proposals or candidates would have been passed in an Andrews 
year. This is an extreme example. But it represents a bias which research councils and 
examiners always need to guard against. 


A stimulus range effect with the method of constant stimuli 


Some of the most perfect stimulus range effects come from the method of constant stimuli. They 
occur when the same standard is used for a block of trials, and the observer always knows 
which stimulus of a pair is the standard. For example the standard can always be presented first, 
or to the left of the variable. In matching loudnesses, the standard may be a tone, while all the 
variables are narrow bands of noise. When the observer always knows which stimulus is the 
standard, he tends to neglect it after a while. He simply judges each stimulus against the range 
of stimuli presented to him. 

Figure 3 illustrates the results of a test of pitch discrimination used in selecting American 
sonarmen (Harris, 1948). The man hears the same standard tone first, always perhaps 1 kHz. He 
has to say whether the variable tone, which always follows the standard, is higher or lower in 
pitch. The figure shows the percent of tones judged higher for various differences in frequency 
between the standard and the variable. A steep S-shaped function means good discrimination. 

Figure 3. A stimulus range effect in the method of constant stimuli when the observer always knows which 
stimulus is the standard. Results from Harris (1948). 


The filled points and unbroken function represent the average of 148 men. For another group 
of 80 men, Harris stopped presenting the standard at all after the first 40 practice trials. The 
dotted function shows that the men’s subsequent judgements hardly changed. The function is a 
little less steep, but the difference is not reliable. 

With another group of 70 men, Harris increased the pitch of the standard by 3 Hz after the 40 
practice trials, without telling the men. The variables were left unchanged, arranged 
symmetrically around the original standard. The dashed function shows that again the men’s 
subsequent judgements hardly changed. The dashed function should have shifted 3 Hz to the 
right, to match the higher standard. This clearly demonstrates that the men were not attending to 
the standard. They judged each variable tone against the range of previous variables. It is an 
excellent example of a stimulus range effect. 


A response range effect in very first judgements of size constancy 

Figure 4 illustrates a beautiful response range effect for very first judgements of the height of a 
white post in a field, from Joynson, Newson & May (1965). The ordinate shows height in metres. 
The filled points connected by the unbroken lines represent the physical heights. The slopes of 
the two unbroken lines are arbitrary, but the same. 

The unfilled points represent the mean very first judgements of separate groups of 18 or 30 
people, who each judged the height of only a single post. The distance of the posts varied for 
different people from 25 to 250 or 400 metres. All points include judgements at all distances. On 
the left of the figure the heights of short posts are overestimated. On the right, the heights of tall 
posts are underestimated. 

Experiment 1 was carried out on an airfield. The dashed line fitted to the unfilled points 
crosses the unbroken line at a height of about 2 m. The height is neither overestimated nor 
underestimated on average. It is about the height of the perimeter fence round most airfields. 

Experiment 2 was carried out on a level field of mown grass. The fitted dashed line crosses the 
unbroken line at a height of about 1 m. This height, which is neither overestimated nor 
underestimated, is about the height of a well-kept hedge or fence round a field. In making his 
very first judgement, the man must use as a reference the height of a fence or hedge. When in 
doubt, he plays safe and selects a height a little nearer to this reference height than he should. 


[Figure 4 plot. Panel title: Height constancy, very first judgements. Ordinate: height in metres, 0 to 4.] 


Figure 4. A response range effect in very first judgements of the height of a white post in a field. Results from 
Joynson et al. (1965). 


Well-known standards can introduce bias 

This response range effect is particularly important, because it is not determined by the range of 
responses carefully selected by the experimenter. The observers are free to use any heights they 
like. They use the height of a fence or hedge as a reference. These well-known standards which 
people carry around with them can influence any very first judgement when the investigator does 
not influence it himself. 

In very first judgements of noisiness, the well-known standard must be determined by the 
range of noises the people are used to hearing. This question never appears to have been 
investigated directly. If people do use their well-known standards, as the noise level in a district 
falls the just acceptable noise level will also fall. The loudest noises will always be too loud, 
however much they are reduced. This prediction follows directly from the data on just 
acceptable noise levels in Fig. 2. It means that an investigator will never obtain an agreed just 
acceptable level of noise, or of anything else, simply by asking people, because people's answers 
depend upon what they themselves are used to. 

In education, examination boards use standard questions which new examiners mark. They 
enable personal standards to be adjusted to the examination board norm. The board norm is then 
adjusted to produce what is believed to be an acceptable proportion of passes and fails each 
year. A procedure of this kind is all one can offer to physicists and engineers. Human 
assessments are relative, not absolute, so agreed subjective standards are necessarily arbitrary. 


Dealing with investigator-induced range effects 

The results in Fig. 4 show that however careful an investigator is, he cannot always avoid range 
effects. This is because people use their own standards which they carry around with them. But 
an investigator can avoid introducing a range effect of his own. The way to avoid an 
investigator-induced stimulus range effect is to restrict each observer to a single stimulus to 
judge. 

The way to avoid an investigator-induced response range effect is to use a response scale 
without obvious middle categories. With ratings this means presenting only two choices, for 
example ‘acceptable’ and ‘noisy’. When not restricted to ratings, another way of avoiding a 
response range effect is to provide the observer with an unlimited range of responses to choose 
from, for example numbers ranging from zero to infinity as in the experiment of Fig. 4. There is 
then no obvious middle response category for the observer’s responses to converge onto. 

Avoiding stimulus range effects means letting each observer contribute to only one point on a 
psychophysical function. Instead of a few trained observers providing the complete 
psychophysical function, it needs hundreds of observers. This may be why psychophysicists 
have turned a blind eye to the range effects which they generate by their experimental 
procedures. 

If investigators are not willing to go to the trouble and expense of generating unbiased data, 
they ought to devise methods of correcting for the bias which their investigation introduces. This 
can perhaps be done by making spot checks, using separate groups of new observers for each 
point. The biased psychophysical function obtained from the few trained observers can be 
adjusted to fit the points provided by the separate groups of new observers. 


Stimulus range effects in direct magnitude estimation 


The late S. S. Stevens advocated direct magnitude estimation as a method of avoiding the 
problems of range which are inherent in the use of rating scales. With the publication of his 
posthumous book (1975) and of the collected papers in his honour (Moskowitz, Stevens & Sharf, 
1974), Stevens’ method of direct magnitude estimation is probably at the height of its fame. 
Figure 5 shows the first thorough investigation of stimulus range effects in direct magnitude 
estimation (Engen & Levy, 1958). The average subjective assessments of the loudness of a 1 kHz 
tone are plotted on a log scale against the intensity of the tone in decibels. The smaller stimulus 
range is half the size in decibels of the larger range. The ratio of the two slopes is 1.5 to 1. 


Figure 5. A stimulus range effect in direct magnitude estimates of the loudness of a 1 kHz tone. Results from 
Engen & Levy (1958). 


This is a stimulus range effect, but it takes a different form from the stimulus range effect with 
ratings. The observer starts an experiment on magnitude estimation with what he believes to be 
a sensible range of responses. He distributes these responses over whatever the range of stimuli 
presented to him. Thus the observer who is given the smaller stimulus range produces the 
steeper slope or exponent when the data are presented in the form used in Fig. 5 (Poulton, 1968). 

If there were no stimulus range effect, the two functions for the two separate groups of 
observers would have the same slope. If the results were determined entirely by the stimulus 
range effect, the ratio of the two slopes would be 2 to 1, not 1.5 to 1. As for the ratings of 
noisiness in Fig. 2, there is a compromise between a pure stimulus range effect and the ‘true’ 
value. 
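
The 2 to 1 bound can be checked with a short calculation. The sketch below assumes, as the text describes, that under a pure stimulus range effect the observer spreads a fixed range of responses over whatever stimulus range is presented. The particular values of 40 dB and one log unit are hypothetical; only the 2 to 1 relation between the stimulus ranges and the observed ratio of 1.5 come from the text.

```python
def slope_under_pure_range_effect(log_response_range, stimulus_range_db):
    """Slope in log response units per decibel when a fixed response range
    is spread evenly over the stimulus range presented."""
    return log_response_range / stimulus_range_db


# Hypothetical values: the text states only that one range is half the other.
LOG_RESPONSE_RANGE = 1.0
LARGER_RANGE_DB = 40.0
SMALLER_RANGE_DB = LARGER_RANGE_DB / 2

pure_range_ratio = (
    slope_under_pure_range_effect(LOG_RESPONSE_RANGE, SMALLER_RANGE_DB)
    / slope_under_pure_range_effect(LOG_RESPONSE_RANGE, LARGER_RANGE_DB)
)
no_range_effect_ratio = 1.0  # identical slopes if range had no influence
observed_ratio = 1.5         # ratio of slopes reported in the text

print(pure_range_ratio)  # 2.0
print(no_range_effect_ratio < observed_ratio < pure_range_ratio)  # True
```

The ratio of 2 does not depend on the hypothetical values chosen, because the fixed response range cancels out.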

The observers judged in terms of ratios; 100 points had to be divided between each pair of 
sounds to indicate the ratio of their loudnesses. So the data are not influenced by a response 
range effect produced by a limited range of responses with an obvious middle value. 

It is not possible to eliminate stimulus range effects in direct magnitude estimation, because 
any two stimulus values to be judged against each other define a stimulus range. The range 
necessarily influences the very first judgement made by an unpractised observer. Judgements 
after the first are influenced also by the range of stimuli judged previously, as illustrated in Fig. 
5. 

Figure 6 shows very first judgements of loudness and softness. The filled points are for 
multiple estimates of the loudness of an octave band of noise centred on 0.9 kHz (Poulton, 
1969). The unfilled points are for the loudness or softness of a 1 kHz tone, from McRobert, 
Bryan & Tempest (1965) and Tempest, Bryan & McRobert (1965). Again the observers use 
multiple estimates. So the data are not influenced by a limited range of responses with an 
obvious middle value, as fractional estimates are. Each point represents a separate group of 
between 25 and 100 observers, making a very first estimate of loudness or softness. 

All three functions bend over toward the right. So the imaginary lines connecting the standard 
at the origin on the left to each point become less steep as the distance of the points from the 
standard increases. Both the vertical and the horizontal scales are logarithmic. So the slopes of 
the imaginary lines are the exponents of Stevens' power functions. The slopes or exponents 
become smaller as the distance from the standard increases. This is the stimulus range effect in 
very first direct magnitude estimates, which cannot be got rid of. 

Figure 7 shows the data of Fig. 6 replotted with an ordinary arithmetic vertical scale to get rid 
of the curvature in the functions. The agreement between the three functions is pretty good 
considering the small numbers of observers and the lack of any previous practice. Social surveys 
use at least 200 people to get decent data. The deviant filled and unfilled circles at the top of the 
figure represent only 33 and 24 observers respectively. The sloping unbroken line has been fitted 
to all the remaining points. 

Figure 8 shows the exponents calculated for each of the points in Fig. 6. The exponents are 
plotted against the reciprocal of the distance between each point and the standard, in order to 
correspond to Fig. 9. The curved function is the fitted straight line of Fig. 7. The deviant filled 
and unfilled circles of Fig. 7 are on the extreme left just above the curved function. 

The broken horizontal line with an exponent of 0.6 is Stevens' (1955) sone scale, which he got 
accepted by the USA Standards Association (1968) and by the International Organization for 
Standardization (1966). The curved function shows that Stevens' exponent of 0.6 corresponds to 
a log geometric stimulus range with a reciprocal of 0.33, or a range of 30 dB. The 30 dB 
presumably represents the average range of intensities presented at any one time to the 
observers in most of the experiments published before 1955, upon which Stevens bases his sone 
scale. 

Warren's (1958) physical correlate theory makes loudness inversely proportional to the square 
of distance. It equates loudness with sound pressure or amplitude. This requires an exponent of 
1.0. The curved function in Fig. 8 shows that it is obtained with a log geometric stimulus range 
of 1, or 10 dB. This is about the stimulus range which Warren uses in his experiments (1970, 
1973 a, b). 


Figure 6. Stimulus range effects in very first direct multiple estimates of loudness or softness. Results from 
McRobert et al. (1965), Poulton (1969) and Tempest et al. (1965). 

Figure 7. The data of Fig. 6 replotted on a loglinear plot. 

Figure 8. Exponents calculated by Stevens' method from the points of Fig. 6. 

Figure 9. Range effects in Stevens' (1962) exponents for different sensory dimensions. 


Which point to select on the curved function for the standard of loudness is quite arbitrary. 
Warren's exponent of 1.0, which makes loudness inversely proportional to the square of 
distance, corresponds to what people hear as they walk toward or away from a source of sound. 
Most probably people learn this in everyday life, and use it as their basis for making loudness 
judgements. 

On this view, the remaining points on the curved function are generated from the inverse 
square law for loudness by the stimulus range introduced by the experimenter. But it leaves one 
point unexplained. Why does the inverse square law for loudness hold for a 10 dB difference 
between the two tones judged, and not for some other size of difference? An increase of 10 dB 
occurs when a source of sound becomes about three times as close. This happens as a child or 
adult hits an object at arm's length, and again when it is close to his body. It also happens as a 
person approaches three times as close to a source of sound in a room, from perhaps 3 m to 
1 m. This may well be the commonest proportional change of distance when listening to a source 
of sound. If so, it would explain why the inverse square law for loudness holds for a 10 dB 
difference. It is the change in intensity with distance which people have learnt the best. 
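
The link between a 10 dB change and a threefold change in distance can be checked from the inverse square law. The sketch below assumes an idealized point source in a free field, so reflections within a room are ignored; the function names are illustrative.

```python
import math


def distance_ratio_for_level_change(change_db):
    """Ratio of far to near distance giving the stated change in level.

    Under the inverse square law, intensity varies as 1/d**2, so the
    change in decibels is 20 * log10(d_far / d_near).
    """
    return 10 ** (change_db / 20)


def level_change_db(d_far, d_near):
    """Increase in level, in dB, on moving from d_far to d_near."""
    return 20 * math.log10(d_far / d_near)


print(round(distance_ratio_for_level_change(10), 2))  # 3.16
print(round(level_change_db(3.0, 1.0), 1))            # 9.5
```

A 10 dB increase therefore corresponds to coming a little over three times as close, and approaching from 3 m to 1 m gives a little under 10 dB.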


Stimulus range effects in comparisons between sensory dimensions 


In his Klopsteg lecture, Stevens (1962) lists the exponents obtained in his laboratory for 27 
sensory dimensions. Figure 9 shows the exponents plotted against the reciprocal of the log 
geometric stimulus ranges for the 21 sensory dimensions on which adequate experimental data 
have been published. 

In terms of range, Stevens' exponent is the log of the average subjective range used by the 
observers divided by the log of the physical range of the stimuli presented: 

\[ \text{exponent} = \frac{\log(\text{subjective range})}{\log(\text{stimulus range})} \]

In the experiments which provide the data of Fig. 9, the average geometric subjective ranges 
run from 9 to 225, a ratio of 25 or 1.4 log units, whereas the geometric stimulus ranges run 
from 3 to 3000, a ratio of 1000 or 3 log units. The latter ratio is 1.6 log units larger than the ratio of 
the subjective ranges. Since the subjective ranges vary so much less than the physical ranges, 
Stevens’ exponents can be approximated reasonably well by replacing all his subjective ranges 
by a constant subjective range of 27: 

\[ \text{exponent} \approx \frac{\log 27}{\log(\text{stimulus range})} \]

This is indicated by the straight line fitted to the points in Fig. 9. 

The Pearson product-moment correlation is +0.91. This indicates that 83 per cent of the 
variance in the exponents is accounted for by the stimulus ranges. It leaves only 17 per cent of 
the variance as the total contributed by the subjective assessments. This 17 per cent includes the 
differences between the small groups of observers used by Stevens, as well as the differences, if 
any, between the actual sensory dimensions. 
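
The percentages follow from squaring the correlation coefficient, as this short check shows (the variable names are illustrative):

```python
r = 0.91  # Pearson correlation quoted in the text

variance_explained = r ** 2               # share accounted for by stimulus range
variance_remaining = 1 - variance_explained

print(round(variance_explained, 2))  # 0.83
print(round(variance_remaining, 2))  # 0.17
```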

It was Teghtsoonian (1971) who first pointed this out. The earlier (Poulton, 1967) correlation 
coefficient was smaller, because it used Stevens' exponents and ranges exactly as he gave them. 
Stevens sometimes used amplitude, sometimes power. Power has to be converted to amplitude 
to get a high correlation. 

Figure 9 shows that a pretty reasonable estimate of Stevens' exponents can be obtained 
without using observers at all. Simply find out what stimulus range he used, and read the 
exponent off the graph. This works reasonably well, because Stevens' exponents are determined 
very largely by his stimulus ranges. 
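
Reading the exponent off the graph amounts to dividing the log of a constant subjective range by the log of the stimulus range. A minimal sketch, assuming the constant subjective range of 27 quoted in the text and a stimulus range expressed as the ratio of the largest to the smallest stimulus in amplitude-type units; the function name and the intermediate example range of 30 are illustrative.

```python
import math

CONSTANT_SUBJECTIVE_RANGE = 27  # constant subjective range quoted in the text


def approximate_exponent(stimulus_range_ratio):
    """Approximate Stevens' exponent from the geometric stimulus range alone."""
    return math.log10(CONSTANT_SUBJECTIVE_RANGE) / math.log10(stimulus_range_ratio)


# The extremes of the stimulus ranges quoted in the text, plus one value between.
for stimulus_range in (3, 30, 3000):
    print(stimulus_range, round(approximate_exponent(stimulus_range), 2))
```

Under this approximation the smallest stimulus range quoted in the text (3) gives an exponent of about 3, and the largest (3000) gives an exponent of about 0.4, without any observers being consulted.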

The slope of the fitted straight line is 1.43 or log 27. It means using on average numbers 
ranging from about 4 to 100, or from 10 to about 250. This is the average of the range of 
numbers which Stevens and his graduate students used. It represents the well-known standard 
which was developed and used at Harvard. 


Well-known rules can be misleading 


Subjective assessments have a major advantage over measures of performance. They tend to 
show more agreement between people than can usually be obtained with measures of 
performance. This is because measures of performance are variable, while subjective 
assessments tend to be influenced by well-known standards or rules, which most people agree 
on. 

In a recent investigation of the effects of wind (Poulton, Hunt, Mumford & Poulton, 1975), 
there were two wind speeds, each at two levels of gustiness or turbulence. Each of 20 
housewives performed a number of tasks in the wind. She then put a mark on a total of 13 lines 
like the one illustrated in Fig. 10 to indicate her subjective assessments. Each line was labelled 
differently. 


Walking   |--------------------------------------------------| 
          Sure                                      Unbalanced 


Figure 10. 100 mm line used to obtain quantitative subjective assessments of wind disturbance. 


On three of the 13 subjective assessments all four wind conditions were reliably different. 
None of the 14 measures of performance discriminated as well as this between the four wind 
conditions. Herbert (1974) found a similar advantage for subjective assessments in measuring the 
effects of disturbed sleep at night upon people the next day. His subjective assessments showed 
reliable differences between his conditions. His measures of performance gave less clear-cut 
results. 

There were two measures of performance to compare with the subjective assessment of Fig. 
10. One was the extent to which the housewife was blown off course as she first entered the 
wind. She wore inky pads strapped to the soles of her shoes, and walked on a white paper 
carpet 1 m wide. The other measure was the consistency with which she could walk in her own 
footsteps in the wind. It was given by the overall width of the composite footmarks, each 
comprising 5 footmarks. 

On both measures of performance the faster gusty wind was reliably worse than any of the 
other three wind conditions. And on the subjectively assessed unbalance, the faster gusty wind 
was also reliably worse than any of the other three wind conditions. So the agreement between 
the subjective and objective measures was good. 

But there was no reliable association between the position of the housewife's mark on the line, 
and either measure of her walking in the wind. Housewives who were blown about were no 
more likely to give a high unbalanced assessment than housewives who were not blown about so 
much. 

Perhaps the lack of association was due to the difficulty of deciding exactly where to mark the 
line. Fortunately each housewife had two wind conditions with the same average windspeed, one 
gustier than the other. So it was possible to compare the difference between the positions of the 
marks on the lines, with the difference between the measures of performance in the two wind 
conditions. Again there was no reliable association between the subjective assessments and the 
measures of performance. 

Most probably this is because the housewives were basing their assessments upon the 
well-known rule that a strong gusty wind blows people about. This is all right as long as the 
well-known rule predicts the measures of performance, as it does here. But it has recently 
become clear that the well-known rules about environmental stresses do not always hold. 


Well-known rules about human performance in noise 


There is the well-known rule that noise is distracting when it is irregular and intermittent. Noise 
also masks important sounds; that is, noise prevents a person from being able to hear what he 
wants to listen to. These two well-known rules about noise have not been challenged. 

Twenty years ago Broadbent put forward the now well-known rule that continuous unvarying 
noise has a direct detrimental effect upon work which is distinct from the effect of noise in 
masking sound. As possible mechanisms for the effect, he suggested distraction by the continuous 
noise, and blocks or internal blinks. Broadbent based his claim upon his first two experiments on 
noise (1953, 1954). The claim is made in his two reviews on noise, a review in the Handbook of 
Noise Control (1957), and chapter 5 of his book Perception and Communication (1958). 

Broadbent’s now well-known rule is included in three influential journal reviews on the effects 
of noise upon man, in the Psychological Bulletin (Plutchik, 1959), in the Journal of the 
Acoustical Society of America (Miller, 1974) and in Human Factors (Mirabella & Goldstein, 
1967). It is also included in four reviews of the effects of noise upon man in books (Burns, 1973; 
Davies & Tune, 1970; Kryter, 1970; Poulton, 1970). Kryter is the only author who refuses to 
believe Broadbent. 

Yet all the experiments in which continuous noise is found to degrade performance can be 
explained by masking (Poulton, 1977). The noise masks either the auditory feedback from the 
man’s control, which can help him in the control condition, or the man’s inner 
speech, which assists his short-term memory. 

In both Broadbent’s original experiments on noise, the noise masks the auditory feedback 
from the man’s control. Figure 11 illustrates five of the 20 dials used in Broadbent’s (1954) dial 
watching task. The arrows show that a clockwise rotation of a control raises the needles of some 
dials while it lowers the needles of other dials. This is confusing. If the man fails to move a 
needle in the correct direction within 9 sec of a signal, his performance is said to be ‘impaired’. 

Figure 11. Five of the 20 dials used by Broadbent (1954) in an experiment on continuous noise. 

The rotation of a control in the correct direction produces a click from the microswitch 
mounted directly behind it to record the response (Broadbent, 1950, Appendix 1). The man can 
hear the click in the control condition. So he can twiddle the control until he does hear the click. 
The click is masked by the noise. In noise each man has only to become confused once or twice 
more than in the control condition and make the correct response a little late, to produce 
Broadbent’s results. 

Broadbent’s (1953) second early experiment uses Leonard’s five-choice task of serial reaction. 
Here the man has to tap brass discs with a steel stylus. When the man just misses a disc and hits 
instead the paxolin board in which the discs are mounted, the tap has a lower pitch. In quiet the 
man can hear the change in pitch when he is not looking. The noise makes the pitch 
discrimination more difficult. But tapping hard helps to reduce the masking produced by the 
noise. 

Broadbent does not mention the masking of the auditory feedback in his experiments in either 
of his published papers. In both his reviews he discusses the theoretical and practical 
significance of the masking of auditory feedback, and quotes his own two experiments. But 
again he does not mention the masking in his own two experiments. And he does not appear to 
have published a paper correcting the authors who quote his now well-known rule. 


Improved performance in noise 


Yet noise is not always detrimental. Noise is arousing, and behavioural arousal improves the 
performance of tasks which are simple and dull, or simple and require speed. Behavioural arousal 
helps to counteract the detrimental effect of noise in masking sounds. Behavioural arousal may 
also counteract the distraction of irregular intermittent noise. So noise can either improve or 
degrade performance. Which happens depends upon whether the beneficial effects of the increase 
in behavioural arousal are greater or less than the effects of the masking or distraction. 

There are experiments like Broadbent's where noise degrades performance by masking the 
auditory cues. In what is probably the majority of experiments, many of which do not get 
published, noise has no reliable effect. Here the effects of masking or distraction cancel the 
beneficial effects of increased behavioural arousal. 

But in quite a number of experiments noise reliably improves performance. There are 19 
experiments in which continuous unvarying noise reliably improves performance, and another 19 
experiments in which intermittent or varying noise reliably improves performance (Poulton, 
1976). Music while you work is an example. These 38 experiments are all exceptions to the 
well-known rule that noise degrades performance. 

Results of this kind are not usually reported when people are questioned about noise. The 
rating scales for noise which provide the results in Fig. 2 do not even allow favourable ratings. 
They contrast acceptable with noisy, inoffensive with annoying, and unobjectionable with 
objectionable. Ratings such as ‘bracing’ or ‘stimulating’ are not provided. People are not given 
the opportunity of indicating that noise can be beneficial, if they happen to notice that it is. 

Yet the popularity of discos, where people deafen themselves with musical noise, shows that 
many people like noise. Some people carry noisy transistor radios around with them all the time. 
It is not noise in general but other people's noises which people object to. 


Beneficial effects of heat 


Figure 12 shows the theoretical relationship between time in heat and the man's level of 
behavioural arousal. When a man first enters a hot room, the heat hits him in the face. It 
temporarily increases his level of behavioural arousal. This is indicated by the steep rise in both 
functions on the left of the figure. 

Figure 12. Model for the effects of heat upon behavioural arousal. 

Later on the man’s level of behavioural arousal falls. This is indicated by the unbroken 
function at the bottom of the figure. The man may begin to feel sleepy if he has to perform a 
dull routine task. 

However the top function shows that if the body is actually being heated up, the level of 
behavioural arousal stays high. Just before the man collapses, he may become so excited that he 
refuses to finish the experiment. This is indicated at the top of the figure on the right. It is a 
built-in safety mechanism. If the man does not escape before he collapses, he never will unless 
someone is available to carry him out. 

The well-known rule is that heat degrades performance. This is what usually happens if the 
man is left in the hot room for an hour or two, and is then given some dull task to perform. 
What is different about the relationships in Fig. 12 is the initial increase in behavioural arousal 
on first entering the hot room, and the increase in behavioural arousal as the body temperature 
gradually rises. 

There are nine experiments in which performance improves reliably in heat (Poulton, 1976). 
The measures of performance which show the reliable improvements, in both heat and noise, are 
the measures which are affected by behavioural arousal: detections in vigilance tasks, the rate of 
work, and reaction time. The nine exceptions contradict the well-known rule that heat degrades 
performance. 

As with noise, the beneficial effects of heat may not be reported when people are questioned. 
An example is an investigation carried out by Allnutt & Allan (1973) at the RAF Institute of 
Aviation Medicine at Farnborough. Twenty trainee pilots had their deep body temperature raised 
to 38.5 °C (101.3 °F) while keeping their skin cool. In this condition the group of trainee pilots 
worked reliably more quickly than usual on a test on non-verbal reasoning, without making more 
errors. Yet, of the 15 pilots who worked more quickly, ten stated at the end of the investigation 
that they had worked more slowly. Presumably the ten pilots used the well-known rule that heat 
degrades performance. They cannot have been basing their answers upon their actual 
performance. 


Subjective assessments can supplement measures of performance 


Although subjective assessments can be misleading, they may provide useful additional 
information to supplement measures of performance. If an investigator wishes to study 
subjective reactions to environmental stresses, he has to use subjective assessments. Subjective 
assessments are also required to find out how much pain or discomfort a person is feeling, or the 
extent of a person’s well-being (Campbell, 1976). A doctor uses both subjective and objective 
measures when he makes a medical diagnosis partly upon a patient’s symptoms, and partly upon 
a physical examination and laboratory tests. Both kinds of information can be valuable to him, 
although he does not necessarily give them equal weight. 

No manager can afford to neglect subjective assessments. People may work rather more 
effectively in noise or in uncomfortable heat because they are more aroused. But there is the 
problem of getting people to come to work. Figure 13 (Senneck, 1975) shows that managers have 
not been particularly successful in this. 

In the mines as elsewhere, fatal and serious injuries have been reduced. But time taken off 
work for minor injuries and sickness is increasing. Work lost through strikes is always hitting the 
headlines. But over 20 times as much work is lost because people do not come to work, about 
400 million working days each year (Garnett, 1973). 

Vernon, Bedford & Warner (1928) showed that the environment at work influences people in 
their decision whether or not to come to work. Coal-miners working at hot damp coal faces were 
more likely to stay away from work for minor illnesses and accidents than coal-miners who 
worked in better conditions. Stress at work, whatever its cause, probably makes people less 
likely to come to work when they feel poorly. 

Figure 13. Injuries and absences in mining. After Senneck (1975). 


One possible way of attracting people to work is to make the stress at work less than the 
stress at home. The relationship between the stress at work and at home can probably be 
determined only by asking people. But the effect of reducing the stress at work can be 
determined objectively, by seeing whether absenteeism is reduced. 

People can also be attracted to work by making the company of the other people at work at 
least as enjoyable as the company at home. In Belgium, which has very little absenteeism, the 
law requires that there be a bar in every factory, where wine, beer, brandy and vermouth are 
served (Yolles, Carone & Krinsky, 1975). Social breaks at work can then compete on favourable 
terms with social activities at home. They may help to compensate for the stresses at work 
which cannot be reduced. Clearly efficient managers need to be concerned with the subjective 
reactions of their employees. These reactions can be used to supplement the objective measures which 
make an organization viable, like absenteeism and productivity. 


References 


ALLNUTT, M. F. & ALLAN, J. R. (1973). The effects 
of core temperature elevation and thermal 
sensation on performance. Ergonomics 16, 
189-196. 

(ASHRAE). AMERICAN SOCIETY OF HEATING, 
REFRIGERATING AND AIR-CONDITIONING 
ENGINEERS INC. (1972). Handbook of 
Fundamentals. New York: ASHRAE. 

ANDREWS, B. & FINCH, D. M. (1951). Truck-noise 
measurement. Proc. Highway Res. Board 31, 
456-465. 

BOWSHER, J. M., JOHNSON, D. R. & ROBINSON, D. 
W. (1966). A further experiment on judging the 
noisiness of aircraft in flight. Acustica 17, 
245-267. 

BROADBENT, D. E. (1950). The twenty dials test 
under quiet conditions. Medical Research Council, 
Applied Psychology Unit, Report 130, Cambridge. 

BROADBENT, D. E. (1953). Noise, paced 
performance and vigilance tasks. Br. J. Psychol. 
44, 295-303. 


BROADBENT, D. E. (1954). Some effects of noise on 
visual performance. Q. Jl exp. Psychol. 6, 1-5. 

BROADBENT, D. E. (1957). Effects of noise on 
behavior. In C. M. Harris (ed.), Handbook of 
Noise Control, chapter 10. New York: 
McGraw-Hill. 

BROADBENT, D. E. (1958). Perception and 
Communication. London: Pergamon Press. 

BURNS, W. (1973). Noise and Man, 2nd ed. London: 
Murray. 

CAMPBELL, A. (1976). Subjective measures of 
well-being. Am. Psychol. 31, 117-124. 

DAVIES, D. R. & TUNE, G. S. (1970). 
Human Vigilance Performance. London: Staples 
Press. 

ENGEN, T. & LEVY, N. (1958). The influence of 
context on constant-sum loudness-judgments. Am. 
J. Psychol. 71, 731-736. 

GARNETT, J. (1973). The Work Challenge. London: 
Industrial Society. 


GENERALDIREKTION PTT, SWITZERLAND (1959). 
Forschungs und Versuchsanstalt. Rep. No. 22, 637 
and Appendix 

HARRIS, J. D. (1948). Discrimination of pitch: 
Suggestions toward method and procedure. Am. J. 
Psychol. 61, 309-322. 

HERBERT, M. (1974). The effects of disturbed sleep. 
PhD thesis, Cambridge University. 

HOLLINGWORTH, H. L. (1910). The central tendency 
of judgment. J. Phil. Psychol. scient. Meth. 7, 
461-469. 

HOPKINSON, R. G. & COLLINS, J. B. (1970). The 
Ergonomics of Lighting. London: Macdonald. 

INTERNATIONAL ORGANIZATION FOR 
STANDARDIZATION (1966). Method for calculating 
loudness level. ISO/R 532-1966 (E). 

INTERNATIONAL ORGANIZATION FOR 
STANDARDIZATION (1972). Guide for the 
evaluation of human exposure to whole-body 
vibration. Draft International Standard ISO/DIS 
2631. 

JOYNSON, R. B., NEWSON, L. J. & MAY, D. S. 
(1965). The limits of over-constancy. Q. Jl exp. 
Psychol. 17, 209-216. 

KNOWLES, W. B. (1967). Flight controllers for jet 
transports. Hum. Fact. 9, 305-320. 

KNOWLES, W. B., BURGER, W. J., MITCHELL, M. 
B., HANIFAN, D. T. & WULFECK, J. W. (1969). 
Models, measures, and judgments in system 
design. Hum. Fact. 11, 577-590. 

KRYTER, K. D. (1970). The Effects of Noise on Man. 
New York: Academic Press. 

McROBERT, H., BRYAN, M. E. & TEMPEST, W. 
(1965). Magnitude estimation of loudness. J. 
Sound Vib. 2, 391-401. 

MILLER, J. D. (1974). Effects of noise on people. J. 
acoust. Soc. Am. 56, 729-764. 

MIRABELLA, A. & GOLDSTEIN, D. A. (1967). The 
effects of ambient noise upon signal detection. 
Hum. Fact. 9, 277-284. 

MOSKOWITZ, H. R., STEVENS, J. C. & SCHARF, B. 
(1974). Sensation and Measurement: Papers in 
Honor of S. S. Stevens. Boston, Mass.: Reidel. 

PARDUCCI, A. (1963). Range-frequency compromise 
in judgment. Psychol. Monogr. 77, (2), no 565. 

PLUTCHIK, R. (1959). The effects of high intensity 
intermittent sound on performance, feeling and 
physiology. Psychol. Bull. 56, 133-151. 

POULTON, E. C. (1967). Population norms of top 
sensory magnitudes and S. S. Stevens’ exponents. 
Percept. Psychophys. 2, 312-316. 

POULTON, E. C. (1968). The new psychophysics: Six 
models for magnitude estimation. Psychol. Bull. 
69, 1-19. 

POULTON, E. C. (1969). Choice of first variables for 
single and repeated multiple estimates of loudness. 
J. exp. Psychol. 80, 249-253. 

POULTON, E. C. (1970). Environment and Human 
Efficiency. Springfield, Ill.: Thomas. 


POULTON, E. C. (1973). Unwanted range effects 
from using within-subject experimental designs. 
Psychol. Bull. 80, 113-121. 

POULTON, E. C. (1974). Tracking Skill and Manual 
Control. New York: Academic Press. 

POULTON, E. C. (1975). Range effects in 
experiments on people. Am. J. Psychol. 88, 3-32. 

POULTON, E. C. (1976). Arousing environmental 
stresses can improve performance whatever 
people say. Aviat. Space Environ. Med. 47, 
1193-1204. 

POULTON, E. C. (1977). Continuous loud noise 
masks auditory feedback and inner speech. 
Psychol. Bull. (in press). 

POULTON, E. C., HUNT, J. C. R., MUMFORD, J. C. 
& POULTON, J. (1975). The mechanical 
disturbance produced by steady and gusty winds 
of moderate strength: skilled performance and 
semantic assessments. Ergonomics 18, 651-673. 

ROBINSON, D. W., COPELAND, W. C. & RENNIE, 
A. J. (1961). Motor vehicle noise measurement. 
Engineer 211, 493-497. 

SENNECK, C. R. (1975). Over-3-day absences and 
safety. Appl. Ergonomics 6, 147-153. 

STEVENS, S. S. (1955). The measurement of 
loudness. J. acoust. Soc. Am. 27, 815-829. 

STEVENS, S. S. (1962). In pursuit of the sensory law. 
Second Klopsteg lecture, Northwestern 
University. 

STEVENS, S. S. (1975). Psychophysics. Introduction 
to Its Perceptual, Neural and Social Prospects. 
New York: Wiley. 

TEGHTSOONIAN, R. (1971). On the exponents of 
Stevens’ law and the constant in Ekman’s law. 
Psychol. Rev. 78, 71-80. 

TEMPEST, W., BRYAN, M. E. & McROBERT, H. 
(1965). The estimation of relative loudness. Paper 
B16, 5e Congrès International d’Acoustique, 
Liège, Belgium. 

U.S.A. STANDARD (1968). Procedure for the 
computation of loudness of noise. USAS S3.4. 

VERNON, H. M., BEDFORD, T. & WARNER, C. G. 
(1928). A study of absenteeism in a group of ten 
collieries. Medical Research Council, Industrial 
Fatigue Research Board Report 51. London: 
HMSO. 

WARREN, R. M. (1958). A basis for judgments of 
sensory intensity. Am. J. Psychol. 71, 675-687. 

WARREN, R. M. (1970). Elimination of biases in 
loudness judgments for tones. J. acoust. Soc. Am. 
48, 1397-1403. 

WARREN, R. M. (1973a). Anomalous loudness 
function for speech. J. acoust. Soc. Am. 54, 
390-396. 

WARREN, R. M. (1973b). Quantification of loudness. 
Am. J. Psychol. 86, 807-825. 

YOLLES, S. F., CARONE, P. A. & KRINSKY, L. W. 
(1975). Absenteeism in Industry. Springfield, Ill.: 
Thomas. 


Received 1 June 1976; revised version received 10 September 1976 


Requests for reprints should be addressed to Dr E. C. Poulton, MRC Applied Psychology Unit, 15 Chaucer 
Road, Cambridge CB2 2EF. 


Br. J. Psychol. (1977), 68, 427-429 Printed in Great Britain 427 


Precautions in experiments on noise 


Donald E. Broadbent 


As always, Poulton (1977) raises a number of excellent points in his Myers lecture. In one area, 
however, he makes unwarranted statements of fact, which cannot be left without challenge. The 
section in question concerns the effects of noise on performance. 

Poulton claims that, in a vigilance task used by myself (Broadbent, 1954), there was an audible 
cue from the apparatus recording a response; and that in quiet this cue resolved uncertainty 
about the direction of the response. This statement is almost certainly false, because: 

(a) The same stimulus and response equipment showed no effect of noise when signals 
occurred in central fixation (i.e. if the pointer moved when the subject was looking at it). 

(b) In the original paper, a different stimulus and response device again showed deterioration 
on some parts of the display but not on others. 

(c) The effect has been repeated by several workers using other techniques; and also by 
myself. One study, for example, required key-press responses to each non-signal event as well as 
to signals. This ensured that the deterioration obtained was not due to the response system 
(Broadbent & Gregory, 1965). 

(d) In the original study, all responses were identical, so that subjects had no doubt about the 
correct action once they had seen the signal. This is conveniently shown by the arrows on the 
knobs in Poulton’s Fig. 11. Subjects were practised for an hour and a half session before any 
data were collected; they were in no doubt about the direction of response. 

(e) Had subjects ‘twiddled the control’, as Poulton suggests, they would have broken the 
equipment and this would have been detected (Broadbent, 1976). In addition of course the 
apparatus aroused much interest at the time, yet no visitor ever reported any auditory cue. This 
does not rule out subliminal cues, and hence the foregoing analyses were made. 

Poulton also claims that, in a serial response task (Broadbent, 1953), the deterioration was due 
to a difference of pitch occurring if the response stylus missed the recording contacts. In this 
task it is certainly true that each scored response (error or correct) made a sound which would 
not occur if no response was made. However, Poulton’s speculation is very highly improbable 
because: 

(f) The original paper showed a noise effect only on the number of scored errors (contacts 
touched which did not correspond to the stimulus given). There was no increased frequency of 
gaps without a recorded response, as if the contact had been missed. Recorded error and correct 
responses were all made on the same contacts, and therefore produced identical sounds. 

(g) The noise need not be present at all during performance of this task; a subject who simply 
reads in noise gives a deterioration in subsequent performance of the serial response task in quiet 
(Hartley, 1973). No apparatus cues can be relevant in that case. 

(h) The distribution of error latencies in noise is the same as that in quiet (Hartley, 1973). If 
acoustic cues were in some indirect way responsible for errors, the latency of the latter would 
be different in noise. 

(i) Wearing ear defenders reduces the difference between moderate and loud noises, whereas 
delivering noise through earphones produces at least as great a difference (Hartley, 1974; Hartley 
& Carpenter, 1974). On a theory of acoustic cues, the results should be reversed. 

Of the foregoing facts, (a), (b), (d), and (f) were all emphasized in the original papers, while 
(c), (g), (h), and (i) are in papers which Poulton has cited as references on other occasions. The 
relevance of all these arguments for Poulton’s views, as well as a larger array of evidence, has 
been published by Broadbent (1976). Poulton’s suggestion that the matter has not been 
adequately handled in the literature is therefore more than a little puzzling. 

He is again factually incorrect in claiming that my reviews and papers of the 1950s put 
forward a ‘rule’ that noise gives a detrimental effect, and in claiming that ‘it has recently 
become clear’ that stress sometimes has a neutral or beneficial effect on performance. Both the 
original papers emphasized that some parts of each task deteriorated in noise while others were 
unaffected or improved (Broadbent, 1953, p. 297, lines 26-32; 1954, p. 1, bottom three lines, p. 4, 
second paragraph); so do all my later reviews. That fact excludes a large number of otherwise 
plausible explanations, of which apparatus artifacts are only one. Indeed, when Poulton (1976) 
quotes 19 experiments in which noise improves performance, over 40 per cent of them also 
show deterioration measured on the same apparatus in some other way. (Usually, on a different 
part of the stimulus display, or with a different signal probability.) This fact alone appears 
difficult to harmonize with Poulton’s statement that ‘all the experiments in which continuous 
noise is found to degrade performance can be explained by masking’. 

Such a statement is particularly regrettable, because it obscures some of the most interesting 
aspects of the present position. We now know that improvements due to arousal (by financial 
incentives, or other causes, as well as noise) are often at the expense of deterioration elsewhere 
in the task. Secondly, we now know that decisions in vigilance tasks frequently show a general 
increase of confidence in noise, which is helpful when it applies to correct detections but 
harmful when it applies to errors. That is why the pattern of deterioration takes the form found 
earlier. Thirdly, there are now a large number of experiments involving noise which show effects 
persisting after exposure, and on tasks never met in noise (Glass & Singer, 1972; Maier & 
Seligman, 1976). These three effects cannot by their nature be due to apparatus cues; they 
confirm but also long outdate my papers of 25 years ago. The true current issue is whether all 
three effects are due to the same underlying mechanism, or whether they are separate. My own 
feeling at present is that the first two effects may be linked, but that some of the carryover 
effects are due to a different mechanism. Some, for example, involve perceived control of the 
noise rather than sheer intensity (e.g. Wohlwill, Nasar, De Joy & Foruzani, 1976), while the 
other effects are intensity dependent. 

It is wise to try and look for the positive value in any ideas from so creative and enthusiastic 
an investigator as Poulton; even if in this case his detailed suggestions are completely without 
foundation. Conceivably some of the carryover effects (though not the others) might be due to 
reduced ability to vary the environment by whistling, foot-tapping, tooth-sucking and so on. The 
controls used in existing experiments would not exclude such a mechanism, as they were 
directed at the apparatus itself. Such a possibility is certainly worth experiment; it is bad luck in 
this case that a distinguished investigator should have based a view on errors and omissions in 
his reading of the literature. Such errors happen to all of us; there is a major one in Broadbent 
(1976), but as it makes that paper more favourable to Poulton than a correct reference would 
have been, it can perhaps stand uncorrected! At all events, we owe a great debt to Poulton for 
his contribution on other subjects (see for example, Broadbent, 1977). Precisely because of his 
eminence, this episode may encourage readers to examine the original literature before believing 
the accounts given in secondary sources; including this one. 


Trying to bridge some neuropsychological gaps between monkey and man 


L. Weiskrantz 





It is puzzling that some gross anatomical systems in the brains of monkeys and men appear to function quite 
differently although there is no established basis for this either in their fine-grained anatomical organizations 
or in the inherent behavioural capacities of the two species. It is suggested that in some instances the 
discrepancies may arise because inappropriate tests have been used with the animals, and examples are given 
of positive evidence for cross-modal perception and a possible experimental basis for hemispheric 
specialization in the monkey. In other instances, they may derive from inadequate or inappropriate methods 
of testing human subjects, and attention is focused on two major examples: memory disorders associated 
with medial temporal lobe lesions and blindness associated with occipital lesions in man. In both examples it 
has been generally concluded that the human deficits are far more severe and even qualitatively different 
from those studied in the monkey. Evidence suggests that the discrepancies may be resolved if human 
subjects are tested not by means of ‘commentary’ questions (e.g. ‘do you see this?’ or ‘do you recognize 
this?’) but by methods that depend upon forced-choice or identification procedures that are more closely 
related to those used with animal subjects. It is argued that the study of dissociations between a capacity 
and its acknowledgement by a human subject may suggest a type of brain organization that is consistent both 
with the engineering approach that Craik would have fostered and also with one that places older and newer 
brain structures in a single evolutionary framework. 





No one would dispute, I suppose, that what we do is to some degree controlled by what is 
happening in our brains, and that what happens there is partly determined by its structure, its 
lifetime history, and its current state. Those of us who try to study how the brain controls 
behaviour in monkeys do so for a variety of reasons, but undoubtedly one important reason is 
that we hope to illuminate brain function in man, in whom everyday life imposes natural 
experiments by accidents and disease in great profusion but with poor control. Neuropsychology 
attempts to relate brain dysfunction to other knowledge of brain organization obtained through 
anatomy, physiology, and pharmacology, and all of these in a convergent manner to the study of 
behaviour. The simian brain, in its gross anatomical aspects, closely resembles the human brain. 
Admittedly, it is smaller in relation to body weight than the human brain, it is less convoluted 
and the neocortex is smaller in relation to primitive brain stem structures than is our neocortex, 
or indeed even that of the great apes. But there is no good reason to think that these are other 
than quantitative differences; or at least there is no good basis on which to translate these 
quantitative aspects into qualitative discontinuities. Admittedly, man is not just a monkey writ 
large - we have spoken language and a range of cognitive capacities that no one has as yet 
demonstrated in monkeys; in turn some monkeys are equipped, undoubtedly for good 
evolutionary reasons, with a motor system that in some ways is more highly skilled than ours. 
But it must be disquieting to learn that certain gross anatomical systems in monkey and human 
brains apparently function quite differently, when there is no detectable basis for this either in 
their fine-grained anatomical organizations or in the inherent behavioural capacities of the two 
species. When examples of this occur, and I shall soon dwell on a few major examples, there 
have been two types of intellectual response. Historically, the older one - and it goes back at 
least to William James - concluded that the higher the creature is on some phylogenetic ladder or 
tree, the more greedy becomes the cortex in taking over the control of skills previously 
controlled by older subcortical structures. This is the doctrine of encephalization of function for 
which, in fact, detailed evidence is not very strong (Weiskrantz, 1961). It is in some ways a 
strange doctrine if only because it violates the Peter Principle. And while imperialism between 
brains may have evolutionary value, it is a bit puzzling to see why imperialism within brains 
should be useful. At any rate, what happens functionally to the older pathways? More recently 
there has been a more despairing response, although it is sometimes embraced enthusiastically 
by those who like to preserve puzzles or stress the uniqueness of individual species; it simply 
asserts that anatomical similarity is not a reliable predictor of functional similarity, without 
offering any other principle by which one might join functional dissimilarities together. After all 
these years I am loath to conclude that we have been climbing up the wrong tree, evolutionarily 
speaking, or that we ought not even to be climbing trees, but leaping from one kind of tree to 
another at random. I want to consider first whether the fault may lie with our methods before 
concluding that the fault lies with our brains. 

Of course, we have learned in recent years not to make easy assumptions about the lack of 
particular capacities in fellow primates, as the examples of the Gardners’ work on sign language 
or Premack’s work on token language in chimpanzees ought to remind us. And some of the 
assumptions of functional differences between monkeys and man really have not received 
careful experimental analysis. Or the empirical examination may simply not have put the 
question to the animal in a way that was meaningful or worth the animal’s effort. After all, in 
many experiments by disdainfully ignoring the experimenter’s problem the creature will 
nevertheless get his peanut on 50 per cent of the trials just by behaving randomly, which are 
rather better odds than experimenters themselves find acceptable in conducting research. 
Perhaps this is the explanation of the long-standing failure to demonstrate cross-modal matching 
in monkeys. Humans can obviously match shapes across modalities - you can touch a sphere in 
darkness and then select it by vision from an array of objects. Because with conventional testing 
monkeys fail such a test, it has been assumed that such a capacity might be exclusively human 
and depend upon some internal — perhaps linguistic — mediation. We now know from the work of 
my colleague Peter Bryant that six-month-old human infants can carry out cross-modal matching 
(Bryant, Jones, Claxton & Perkins, 1972), and from the work of Davenport, Rogers and Russell 
that chimpanzees can also succeed (Davenport, Rogers & Russell, 1973) but the monkey has 
pretty consistently been denied such a capacity. Cowey and I wondered if perhaps the monkey 
was missing the point of the experiment, and we speculated that in natural circumstances a 
monkey might indeed find it important and useful to relate information about food in one 
modality to information in another. So we prepared lab chow itself into different shapes (Fig. 1), 
but some of the food shapes were laced with a strong quinine solution and a hefty dose of sand, 
whereas others were left nice and unadulterated. The monkeys were first offered both sorts of 
shapes in darkness, and then offered a choice to be made visually in the light. 

Figure 1. Outline diagrams of five pairs of shapes used in the cross-modal experiment, as viewed from 45° 
above their horizontal surface. Their size can be judged from shape 2A, which was 4-5 cm long. (Reprinted 
by permission of Pergamon Press.) 

There were a variety of the usual control tests for artifacts of various sorts, which I will not go 
into here. The 
result (Cowey & Weiskrantz, 1975) was that the monkeys had no great difficulty in using their 
experience in darkness to select the good tasting shape in the light. Indeed, their performance 
with one version of the test embarrassingly has turned out to be better than any recorded in the 
literature either for the chimpanzee or the young human child (Weiskrantz & Cowey, 1975). It is 
not just Oxford monkeys who are so gifted — in London and in Cambridge monkeys are also able 
to do it, and indeed Dr Iversen and her colleagues, Mr Sahgal and Mr Petrides, already have 
published some interesting results (Sahgal, Petrides & Iversen, 1975) as to which cortical 
structures may be involved critically in allowing such cross-modal matching to occur in the 
monkey. 

There is another assumption about monkeys that is widely accepted and is rather more 
fundamental, based on a limited history of negative evidence. In man we know very well that the 
two cerebral hemispheres are to some extent functionally specialized. Damage to certain regions 
in the left hemisphere produces mainly language disorders and damage to the right mainly 
perceptual spatial disorders. The conclusion is also reinforced by the work of Sperry and others 
on patients with the inter-hemispheric commissures cut, thereby allowing inputs to be directed 
exclusively to one hemisphere or to the other. It is generally assumed that in non-human 
creatures, in contrast, the cerebral hemispheres are symmetrically organized, and that for 
disruption of cognitive skills it is necessary to interfere with corresponding regions in both 
hemispheres. Some years ago James Dewson of Stanford University, Cowey and I (1970) 
wondered whether the appropriate task had been studied in monkeys. We speculated that 
different results might be obtained if a task were used that depended on a capacity that must be 
a prerequisite for appreciating spoken language, namely the discrimination of temporal order of 
sounds. We taught monkeys, in effect, to play a two-key piano, one key producing a pure tone 
and the other a white noise burst. Whenever a particular pair of sounds was presented to the 
monkey by a programming apparatus, any particular pair being unpredictable from trial to trial, 
the animal received his peanut only if he could produce the sequence of sounds by pressing his 
two keys in the appropriate order. Our monkeys learned the task. We then found that a small 
lesion in either hemisphere in the region thought to be homologous with Wernicke's speech area in 
man produced a permanent impairment in performance. This already demonstrated, surprisingly, 
that a unilateral lesion was sufficient to interfere with this type of task, but there was no clear 
difference in those early results between right and left hemisphere lesions. Since then, Dewson 
back at Stanford has developed this type of task in an interesting and in some respects simpler 
way — by requiring the animal to press a key with an appropriate colour rather than in a 
particular spatial location, in other words with spatial coding removed from the task (Dewson & 
Burlingame, 1975). He recently reported (Dewson et al. 1975) that he finds now that only left 
hemisphere lesions interfere with this task, and right hemisphere lesions are without effect. This 
important result must be considered as preliminary and requiring replication and elaboration, but 
it points to the danger of making too easy assumptions about brain organization between monkey 
and man when the task may not have been selected appropriately. One can even suggest that 
there has been hardly any relevant work on the monkey that allows us to conclude that 
hemispheric specialization does not exist in this creature. 

These two examples, which I offer in passing, are instances of assumptions based on what 
may be inadequate or inappropriate methods of testing the monkey. But I now want to turn to 
and dwell on at slightly greater length two examples of discrepancies which may derive from 
inadequate or inappropriate methods of testing human subjects. There is now very considerable 
clinical experience, in both examples, of extremely severe and enduring impairments caused by 
disruption of certain neural systems in man, whereas the monkey is relatively unimpaired after 
interruptions of apparently the same circuits. It is not surprising that these discrepancies have 
been seized upon as being fundamental, because generally it is much easier to study a human 
subject, with whom we have considerable familiarity and the advantage of linguistic intervention, 
than it is to train animals. When the human deficit remains severe and stable with intensive 
testing, whereas the animal deficit is relatively slight, we are indeed inclined to assume that these 
structures are functionally more important and perhaps qualitatively different in man than in 
monkey. I want to concentrate on perhaps the two most striking and most intensively studied 
discrepancies of this sort, and to suggest that we may be in sight of resolving them but that in so 
doing some new and surprising issues arise. 

The first example is of the severe and striking memory disorders that are associated with 
bilateral medial temporal lobe damage in man. The most famous instance of this was the patient 
H.M., who was operated on by Scoville in 1953 for the relief of epilepsy, and who was studied 
over many years by Milner and her associates (Scoville & Milner, 1957; Milner, Corkin & 
Teuber, 1968). H.M. and other surgical patients or patients with diseases of the nervous system 
thought to damage either the medial temporal lobe, especially the hippocampus, or anatomically 
closely related structures, appear to have disastrously poor memory for everyday events, even 
though they need not be intellectually impaired in other ways so long as they are not required to 
bridge a gap without rehearsal of more than a few minutes. They will appear, for example, not 
to remember even the experimenter or physician from day to day even though they have seen 
him or her scores of times, although older skills and strongly established memories from before 
the time of the lesion may sometimes survive. The impairment is permanent and severely 
crippling. When examining these patients clinically it is almost irresistible to conclude that they 
are suffering from a blockade of input into long-term memory, which can be located on a 
conventional two-stage flow diagram, which was how Milner originally characterized the defect. 

The interesting theoretical implications were such that when the surgical cases were reported 
in the 1950s by Scoville & Penfield, not surprisingly there was a rash of animal experimentation 
attempting to pinpoint and analyse the critical circuitry more carefully than can be done in man. 
So far as a memory impairment of the blockade type is concerned, these animal experiments have 
been a dismal failure, although there have been a few false alarms from time to time. By now 
the published papers on hippocampal lesions in animals are extremely numerous. While 
anterograde amnesia is not one of the impairments (indeed such animals actually learn some 
tasks faster than controls) it is indeed possible to discern certain strong threads in this literature. 
The animals appear to be unwilling to relinquish the effects of earlier learning, which makes it 
clear that there is nothing wrong with their capacity to store information over long intervals. 
Thus, they take longer to extinguish a previously rewarded response, they cling to a particular 
discrimination abnormally long when required to shift to a reversal of that discrimination 
(especially if the task is a spatial one), and in learning mazes they adhere longer to a particular 
strategy before trying out a new one. If they learn successive tasks, the first one is more likely 
to interfere with the second than in control animals. We are now being increasingly led to 
believe that the hippocampus is not a functionally unitary structure, and so we are probably being 
confronted with a variety of deficits in the literature, but at least one major class of results 
appears to reflect a difficulty in controlling and restraining the influence of prior learning on 
present performance (cf. Weiskrantz, 1971; Douglas, 1975; Kimble, 1968, 1975; Weiskrantz & 
Warrington, 1975). 

After spending several years of tantalizing but frustrating effort to isolate a blockade type of 
long-term memory impairment in monkeys, it seemed worth having a new look at the long-term 
human memory deficit and Dr Elizabeth Warrington of the National Hospital and I set out to do 
so about nine years ago. Our first efforts were designed to replicate one of our monkey findings, 
the details of which need not concern us now. At any rate, it was a devastating failure in its 
main aim, so incapable seemed our amnesic patients to learn anything at all even after several 
repetitions of simple lists of words. But we noticed, as we went on with our recall tests, that 
increasingly the wrong responses seemed familiar to us - and indeed it emerged on analysis that 
up to half of them were words from earlier lists that we had presented (Warrington & 
Weiskrantz, 1968 a). It appeared that the words perhaps were being stored but were emerging at 
the wrong time as false positive responses. At about that time, Dr Warrington suggested that we 
try a kind of memory test that did not involve explicitly asking the subjects to remember 
anything as such, but instead to ask them simply to identify pictures or words from which we 
might infer that memories had been established. For this purpose she adapted a test of 
perception designed by Gollin (1960) so that it became a memory test (Warrington & Weiskrantz 
1968 b). How it works is illustrated in Fig. 2. The subjects are shown very fragmented pictures 
or words. If they cannot guess the object or the word, they are shown the next most fragmented 
version, and so forth, until correct identification is achieved. 

Figure 2. Examples of fragmented picture and word. The subjects were first shown the most fragmented 
version, then successively less fragmented versions until correct identification was achieved. (Reprinted by 
permission of Macmillan & Co.) 

When the procedure is repeated several times, normal subjects get better and better at 
identifying the items in a fragmented form, and the improvement is maintained over several days. 
The identification is specific to 
the items to which they had been exposed, i.e. their learning and retention is not just a 
non-specific practice effect. 

To our surprise the amnesic patients could learn and retain information from day to day. We 
also went on to show that it was not the perceptual peculiarities of this material that were 
essential — the first few letters of a word, or a semantic hint, also produced good retention. (All 
of these methods I shall label here as cued recall methods.) In fact we found that it did not 
matter whether such cues were used to instil learning - whole words could be used just as in a 
conventional learning procedure, but what was crucial was that the cues be given at the time of 
retrieval of the material (cf. Warrington & Weiskrantz, 1973, Weiskrantz & Warrington, 1975, 
for review). This type of result has proved to be quite robust and with it one can demonstrate 
retention by amnesic patients over long periods. In fact, using this kind of technique Milner 
(1970) reported that H.M. showed retention of pictorial material over a four-month period. Nor 
does it appear to be the case, as was suggested by others, that the amnesic patient is just 
showing degraded normal memory for which our procedure provides a set of crutches: 
amnesic patients are differentially aided by cued recall methods compared with control subjects 
(Fig. 3). 
Figure 3. Number of correct responses when retention was tested by cueing with the initial three letters of 
words and by yes/no recognition. The interval between initial presentation of the whole words and retention 
testing was 10 min. (Reprinted by permission of Pergamon Press.) 


How could such surprising evidence for storage be allowed to shine through in a patient who 
is apparently unable to retain information for more than a few minutes with other methods? 
Suppose, for whatever reason, that the amnesic patients do have a real problem with intrusions 
by inappropriate items in memory or their relevant memories are blocked by such congestion. 
Whatever else it does, partial information, i.e. cues, limits the range of false positive responses. 
If I ask you what words start with POR you will have to be very contrary to say ELEPHANT. In 
fact, the English language varies markedly in the extent to which the first three letters of a word 
will preclude alternative words. The letters PRE preclude far fewer words than the letters ONI, 
which in fact will match only a single common English word. Indeed, there cannot be a false 
positive response to YAC or ONI - you either know the one common word that goes with each of 
these triplets or you do not, unless you are a poor speller (YACHT and ONION). Other triplets will 
match only two common words in English, such as MOA and COT (MOAN, MOAT; COTTON, 
COTTAGE). This convenient property allows us to see whether the degree of retention by amnesic 
patients varies as a function of the number of incorrect words that can be excluded by the cue, 
and it also allows us to set up experiments deliberately designed to produce excessive 
opportunities for interference by intrusions. Warrington and I have carried out a number of 
experiments that have produced a coherent picture. For example, if unique cues are given, such 
as ONI or YAC, amnesic subjects show normal retention of the whole words to which they were 
exposed 24 hours before, and perhaps even better than normal retention. If cues are given that 
match several words, their performance deteriorates more severely than does that of control 
subjects. If a cue is given that matches only two words in English, the amnesic subject will 
show excellent retention if he has been given a list of only one of these two alternatives to learn. 
When, however, he is then shown the alternative set to learn, the same cues can now evoke 
either word from storage and he performs very poorly. That is, he goes on producing the original 
word for much longer than the control subjects (Fig. 4). This, in fact, may be a human verbal 
analogue of discrimination reversal tests in animals (cf. Warrington & Weiskrantz, 1973, and 
Weiskrantz & Warrington, 1975). 
Figure 4. ‘Reversal’ learning. For each recall cue (the initial three letters) there were only two common 
English words available as possible responses. Subjects were first taught one set of words (list 1) and then 
were given four trials with the alternative set (R1-4). The same cues were used on all five trials. (Reprinted 
by permission of Pergamon Press.) 
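
The constraining power of a three-letter cue can be made concrete with a small sketch. The following Python fragment is illustrative only: the word list is a hypothetical stand-in chosen to include the examples in the text (a real analysis would need a frequency-ranked list of common English words), and the function names are assumptions introduced here, not part of the original experiments. It counts how many listed words each initial triplet matches, and which competitors a cue could evoke besides the word that was learned.

```python
from collections import defaultdict

# Hypothetical mini-vocabulary for illustration; not the stimulus set of the study.
VOCABULARY = [
    "YACHT", "ONION", "MOAN", "MOAT", "COTTON", "COTTAGE",
    "PRESENT", "PRESS", "PRETTY", "PREVENT", "PREPARE",
]


def stem_counts(words, stem_length=3):
    """Map each initial-letter stem to the list of words beginning with it."""
    by_stem = defaultdict(list)
    for word in words:
        word = word.upper()
        by_stem[word[:stem_length]].append(word)
    return dict(by_stem)


def competitors(stem, learned_word, words):
    """Words other than the learned one that the same cue could also evoke."""
    matches = stem_counts(words, len(stem)).get(stem.upper(), [])
    return [w for w in matches if w != learned_word.upper()]


if __name__ == "__main__":
    for stem, matches in sorted(stem_counts(VOCABULARY).items()):
        print(stem, len(matches), matches)
```

With this toy list, YAC and ONI each match a single word, MOA and COT match two, and PRE matches several, which mirrors the ordering of cue constraint described above: the fewer the competitors, the less room for intrusions.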


One more example can be given of the beneficial potency of constraining false positive 
responses for the amnesic patient, and of the disastrous effects of deliberately designing lack of 
constraint into the experiment. It is instructive because it uses material that is usually found to 
be almost impossible for amnesic subjects to remember, namely so-called paired associates. This 
was apparently first demonstrated with patients by Wechsler (1917), although MacCurdy in 
Cambridge in 1928 claimed the credit for suggesting it to Wechsler. The task is to present 12 
pairs of words, one pair at a time. Some time later the subject is shown the first word of each 
and asked to produce the second. As conventionally constructed, the pairs are unrelated, and in 
this case the amnesic subject cannot remember more than one or two out of a list of 12 pairs. 
My colleague, Gordon Winocur of Trent University, showed that if the pairs are constrained by 
some implicit rule, either a semantic or a phonetic one, amnesic subjects show excellent 
retention (Winocur & Weiskrantz, 1976). If, next, the first word in each pair is paired with 
another word that follows the same rule, the amnesic subject continues to perseverate with the 
response he first learned. An example would be PEACE-CALM followed by a list with say, 
PEACE-QUIET. Using a phonetic constraint, the example might be PEACE-NIECE, PEACE-GEESE 
(Figs 5 and 6). On the other hand, changing the rule from semantic to phonetic or vice versa 
helps the amnesic subject to learn the second list. So we see that we can go through the whole 
range of normal retention to hopelessly poor retention depending on how we design our material. 
Figure 5. Error patterns in learning semantically related paired associates. The initial set of 12 pairs was 
presented four times, followed by a recall test 1 min after the fourth presentation. Subjects were then given 
an alternative set of semantically related pairs in which the initial members of the pairs were the same as in 
the first list. Retention was tested after each of the learning trials for the second list. ‘Direct intrusions’ refer 
to responses appropriate to the first list that were offered by subjects during the list 2 retention trials. 
(Reprinted by permission of Pergamon Press.) 


There is more than one possible theoretical interpretation of findings such as these, and the 
current stage of analysis is trying to tease them apart (cf. Weiskrantz & Warrington, 1975). Here 
I do not wish to go into such issues. My main point is that whatever theory captures support in 
the end, it is one that can apply equally easily both to the monkey and to man. In both species 
there is an interference of current retention from past experience when the cue is insufficient to 
constrain such interference. We leave open the question as to why interference itself should be 
such a problem when there is hippocampal damage. But there is no longer any embarrassment 
over being unable to find a blockade of input into long-term memory in the animal, because there 
is no such blockade in man either. 

Figure 6. Conditions identical to those pertaining to Fig. 5, except that the pairs of words in lists 1 and 2 
were rhymes. As in Fig. 5, the first members of the pairs were common to both lists, but the second 
members were different. (Reprinted by permission of Pergamon Press.) 

My second major example concerns a dilemma that has been with us for much longer: it 
concerns the function of the visual cortex in man and monkeys. It is in fact about 90 years since 
Ferrier first carried out his pioneering work on the visual cortex of monkeys and the problem 
has been with us ever since. In both man and monkey, the majority of nerve fibres leaving the 
eye first travel to the lateral geniculate nucleus of the thalamus. In both species this nucleus 
sends fibres only to the striate cortex (area 17). In man, damage to this cortex, it has been 
agreed for more than half a century, produces severe blindness, or at best only rudimentary 
visual responses to vigorous movement or abrupt changes in illumination. In contrast, the more 
the monkey without striate cortex has been studied, the more remarkable has been the visual 
capacity that has been revealed (cf. Weiskrantz, 1972, for review). At each stage we have been 
over-conservative. Klüver (1942) argued that discriminations of flux could be made. We showed 
in 1963 that discriminations of total contour length could be made even when flux was equated. 
But later Humphrey showed (1970) that, using a simple but ingenious technique, many subtle 
discriminations could be carried out, and many of you will know of the results of his tutelage of 
Helen. Later the Pasiks claimed that even discriminations between simple patterns equated for 
flux, contour length, and probably also for distinctiveness, were possible, again using a 
specialized training technique (Pasik & Pasik, 1971). More recently, we have shown that the 
ability of the animal to detect and locate visual events cannot depend on any complex scanning 
strategy of head or eye movements, because the spatial differentiation is good even with flashes 
of light of shorter duration than the latency with which saccadic eye movements are elicited 
(Weiskrantz, Cowey & Darlington, 1974). These types of findings have evoked much speculation 
and some research on the capacity of the phylogenetically older visual pathways to the midbrain 
and elsewhere that are still intact after interruption of the geniculo-striate route, and have led to a 
variety of theories of the so-called ‘second visual system’. This midbrain pathway perhaps really 
deserves the title of the ‘first’ visual system in terms of its evolutionary history. 

Now while cases of extensive damage to the occipital lobes in man are fairly common, on 
further reflexion it can be argued that exactly comparable instances of damage largely restricted 
to striate cortex as has been studied in the monkey are relatively rare because of the anatomical 
disposition of the cortex in man. This has turned out to be important because ' 
by Pasik & Pasik (1971) suggests that, in the monkey, if the surrounding tissue 
(this tissue receives an input from the superior colliculus in the midbrain via tł 
monkeys are less able to carry out such a wide range of visual discriminations 
have been on the look-out for human cases with damage largely restricted to tl 
Such a case came to our attention at the National Hospital after the neuro-opt 
Sanders, noticed a few interesting features in clinical examination. He, Dr Wa 
Marshall and I then turned to a more detailed experimental examination (Sanders, Warrington, 
Marshall & Weiskrantz, 1974; Weiskrantz, Warrington, Sanders & Marshall, 1974). 

The patient in question had a small tumour removed from his right calcarine fissure, and thus 
his impairment lay in the left half fields of both eyes. On routine clinical testing he appeared 
densely blind in his half field, even with intense lights, except for a small crescent of fuzzy 
vision in the upper peripheral part of the field. Accordingly all of our testing was in the lower 
quadrant. He is a very cooperative subject and we asked him to respond as we asked our 
monkeys to respond - by reaching out and touching a screen on which visual stimuli were 
projected in his blind fields. This, at first, was a very odd task — how can you reach out for 
something you cannot see? Nevertheless, it quickly became apparent to us that his ability was 
remarkable (Fig. 7), and even when eventually his own results were shown to him at the end of 
several hours of testing, he was openly astonished. He thought he was just guessing. Later he 
described 'feelings' that there might be something there, but he consistently refused to call this 
‘seeing’. We went on to require him to ‘guess’ the orientations of lines — whether horizontal or 
vertical, or vertical or diagonal, and again his performance was remarkable. We also showed that 
he could carry out simple form discriminations, if required to guess between two alternatives 
such as ‘X’ or ‘O’, provided the stimuli were large enough (Fig. 8). Never did we present 
knowledge of results until the end of an experiment, which might last several days. And never 
would he acknowledge seeing, although he was very quick to acknowledge this the moment 
even a faint stimulus appeared on the intact edge of the good field. We were even able to 
measure his visual acuity in the blind field by having him guess in a forced-choice manner 
whether lines were or were not present in a grating. 

Figure 8. Performance in discriminating (by ‘guessing’) X vs. O and vertical vs. horizontal lines as a function 
of the size of stimuli. Size scale logarithmic. Maximum score in all cases is 30; chance performance is 15. 
(Reprinted by permission of Macmillan & Co.) 

Though he denied seeing even the very bright aperture in which the lines appeared, he 
nevertheless could discriminate fine lines of less 
than 2 minutes of arc from a homogeneous background matched in flux (Weiskrantz et al. 1974). 
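
How far a forced-choice score departs from guessing can be checked with a simple binomial calculation. The sketch below assumes two equally likely alternatives and independent trials, as in the 30-trial blocks of Fig. 8, where chance performance is 15. The function name and the example score of 27 are hypothetical illustrations, not data from the study.

```python
from math import comb


def p_at_least(correct, trials, p_guess=0.5):
    """Probability of scoring `correct` or more out of `trials` by guessing alone.

    Assumes independent trials, each with probability `p_guess` of a lucky hit
    (0.5 for a two-alternative forced choice such as 'X' versus 'O').
    """
    return sum(
        comb(trials, k) * p_guess**k * (1 - p_guess) ** (trials - k)
        for k in range(correct, trials + 1)
    )


if __name__ == "__main__":
    trials = 30
    chance_level = trials * 0.5      # expected score from pure guessing: 15
    hypothetical_score = 27          # illustrative value only, not a reported result
    print("chance level:", chance_level)
    print("P(score >= 27 by guessing):", p_at_least(hypothetical_score, trials))
```

The point of such a check is that a subject who insists he is 'just guessing' can still be shown, on purely behavioural grounds, to be performing far above what guessing would produce.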

We have termed this type of capacity ‘blind-sight’. Fortunately it is not restricted to just this 
one case. Evidence with eye movements correlated with the position of ‘unseen’ stimuli in field 
defects caused by bullet wounds in soldiers had been reported earlier from M.I.T. (Pöppel, Held 
& Frost, 1973) and similar positive results to ours have been reported by French workers in 
Lyons. Dr Warrington and I have studied the phenomenon rather more intensively than can be 
reviewed here, especially in trying to determine whether blind-sight differs qualitatively in its 
psychophysics from normal vision. But the point I wish to make here is that, in terms of purely 
behavioural measures of forced-choice discrimination, carried out much as we would carry out 
tests in the monkey, the 'blind' field of the patient has very similar properties to those shown to 
exist in the monkey after striate cortex removal. If anything, the human subject has a more 
impressive residual repertoire than the monkey. And yet when tested by routine methods, the 
patient's field is blind. I believe, as a matter of fact, that the whole domain of visual field defects 
associated with brain damage, which has been thought to be more or less a closed book since 
Gordon Holmes's classical studies during the First World War, must now be studied almost from 
scratch. The properties of the older midbrain system may simply be left untapped by asking the 
subject what I shall term a ‘commentary’ type of question rather than requiring a discriminative 
response independently of the commentary. It will be interesting to see whether in man the 
integrity of the surrounding tissues is also important, and if so the tests will have obvious 
diagnostic value. A further interesting question is whether we can retrain the human subject to 
use his residual capacity. We know that in the monkey active retraining rather than passively 
imposed stimulation is of crucial importance after striate cortex lesions (Cowey, 1967). The 
evidence of Robert Wurtz, of N.I.H. (personal communication) is especially striking, because he 
has shown that if only part of the visual field defect is exercised visually in the training situation, 
then only that part of the field defect recovers in the monkey. Whether a similar exploitation of 
a residual capacity can be effected in human patients blinded by cortical brain damage remains 
to be seen. 

The two examples I have focused upon in the fields of memory and vision have one feature in 
common, namely that the previous limitations lay not with our animal testing methods, but with 
our tests of human subjects. In both instances, but especially in the case of visual field defects, 
the study of the monkey brain has led fairly directly to an extension of knowledge of the 
human brain and in both cases the results were surprising in terms of clinical expectations and 
inquiry. 

It would be sufficient for our present purposes if these were the only features that these two 
examples had in common. But it is possible that they also share another property that may tell 
us something more general about other cortical mechanisms. In both types of defect, the 
surprising degree of residual function was revealed not so much by asking the human subject a 
particular type of question, but by resisting the temptation to ask him a commonplace question, 
such as ‘tell me when you see something’, ‘what do you see?’, or ‘tell me whether you can 
remember something’, ‘what is it you remember?’. In fact, the only persons who remain 
firmly unconvinced of the results are not so much fellow investigators but the subjects 
themselves. I have already mentioned the surprise displayed by our ‘blind-sight’ patient when 
shown his results. But a similar lack of acknowledgement appears to be present in amnesic 
patients as well. The flavour of this is best conveyed by an anecdote published in 1911 by the 
Swiss psychologist, Claparède, about an alcoholic Korsakoff patient who was amnesic. 


. . .I tried the following experiment. . .to see if she would better retain an intense impression that set 
affectivity into play. I pricked her hand forcibly with a pin hidden between my fingers. This little pain was 
as quickly forgotten as indifferent perceptions and, shortly after the pricking, she remembered no more of it. 
However, when I moved my hand near hers again, she pulled her hand back in a reflex way and without 
knowing why. If, in fact, I demanded the reason for the withdrawal of her hand, she answered in a flurried 
way, ‘Isn’t it allowable to withdraw one’s hand?’. . .If I insisted, she would say to me, ‘Perhaps there is a 
pin hidden in your hand’. To my question ‘What can make you suspect that I would like to prick you’, she 
would take up her refrain, ‘It is an idea which came into my head’, or sometimes she would try to justify 
herself with, ‘Sometimes pins are hidden in hands’. But she never recognised this idea of pricking as a 
memory. (Quoted by MacCurdy, 1928). 


This failure of the amnesic patient to acknowledge his memories as his own was commented on 
also by MacCurdy (1928). You will have noticed that our memory techniques that revealed such 
good retention avoided the question ‘tell me what you remember’. Instead a cue was given that 
required a response that happened to depend upon previous learning. We have often had the 
experience of asking an amnesic subject to tell us the words that he was just shown and have 
received the response ‘what words?’ - followed by a remarkably good performance with cueing 
that revealed the integrity of his learning. 

A question that we accept as commonplace, such as ‘what do you see’ or ‘what do you 
remember’ may turn out to be rather more complicated and revealing than we suspected. It is 
what I have just termed a commentary question. It is probably the most common type of 
question we ask fellow humans. In contrast, we never ask animals such questions although in 
principle I think we could. It might be the case that these questions arise in just those situations, 
which are themselves commonplace, when we are exposed to events on which we are 
characteristically able to carry out cognitive operations, or more colloquially we are able to 
think: that is, to classify, to order, to rehearse, to imagine, perhaps to carry out just those 
operations that Craik (1952) himself drew attention to when we build up a model of the external 
world, to use his phrase. When we deal with stable routines such a commentary may be 
superfluous and indeed may be impossible. We cannot offer commentaries on our own visual 
routines involved in accommodation, pupillary adjustments, many of our eye movements, or 
even many complex acquired skills such as playing squash. You will all remember Potter's ploy 
of asking a competitor at golf, with feigned admiration, to describe in detail exactly which 
muscles he uses in precise sequence when he swings his club. Craik himself drew attention to 
such matters when he discussed different levels of physiological functioning. 

Much light has been cast in recent years on neurological disorders by considering them in 
terms of cortical disconnexions or dissociations, but the analysis has been, in the main, of 
horizontal or within-cortex disconnexions, or at least between-hemisphere disconnexions. But 
there may be another type of disconnexion, namely a vertical one, that prevents 
acknowledgements from being offered in response to commentary questions but leaves undisturbed 
certain stable routines that are capable of modification through learning and which may be 
displayed, in fact, by lower vertebrates lacking neocortical structures altogether. The possibility 
of applying an interpretation of defects associated with cortical pathology in terms of a vertical 
disconnexion may be rather more widespread than we have realized and might well warrant 
further inquiry, especially as such an inquiry could be fitted into an evolutionary view of cortical 
growth without having to assume that those older pathways that we share with more primitive 
creatures are redundant in ourselves. 

But to return to our main theme and to recapitulate: I have tried to argue that it is still 
reasonable to assume that similar brain structures in monkey and man reflect similar function. In 
some instances discrepancies may occur or be assumed to exist because the monkey has not 
been asked the question in a way that reveals the true limits of its capacity. In other instances, 
and I have dwelt on the two most striking ones known to me, namely disturbances of vision and 
memory, discrepancies between monkey and human research may have arisen because we 
needed new questions to address to the human subjects or, at least, we needed to refrain from 
asking commonplace questions. In the two examples I have chosen, I believe the animal 
research has turned out to have been along the right lines and, in fact, can take some of the 
credit for allowing a new examination of the human disorders to emerge. If this leads, in turn, to 
further diagnostic power or therapeutic efficacy, the outcome would be highly gratifying. I must, 
before closing, make at least a few disclaimers. I am not suggesting that we have a final theory 
of visual cortex function or of hippocampal function or of related structures. There is no 
shortage of persons trying to fill those gaps, with characteristically intense levels of disputation. 
What I do argue is that it may no longer be necessary to assume that we will need fundamentally 
different theories for man and for monkey. Nor, finally, am I arguing that man is just a monkey 
or even that all monkeys are alike or all men are alike. There are obvious species-specific 
properties of a large variety and range, especially in cognitive development and capacity. But, 
given all that, I think it reasonable to consider that these different cognitive repertoires may be 
subject to similar sorts of modulations and that, finally, the study of dissociations between a 
capacity and its acknowledgement may suggest a type of brain organization that is consistent 
both with the engineering approach that Craik, whose memory we honour by this occasion, 
would have fostered and also with one that places older and newer brain structures in a single 
evolutionary framework. 


Is output order in free recall based on the strength of the memory trace 
or on the subject strategies and depth of coding employed? 


Peter E. Morris 





If output in free recall proceeds from items with strong to those with weak memory traces then Craik’s 
(1970) finding is paradoxical. He reported that probability of recall at a later test increases with output 
position in the first recall attempt. Does recall order depend on depth of encoding and retrieval strategy 
rather than trace strength? Examination of recall order and the serial position of the first item retrieved in a 
conventional free recall experiment supported the hypothesis that the Craik effect occurs because on the 
initial test the items output early are superficially encoded while the last few items are semantically 
processed. By preventing superficial encoding and encouraging semantic processing the Craik effect was 
eliminated. There was no evidence for a trace strength explanation of recall order. 


What determines the order in which items are retrieved in a free recall task? One obvious 
hypothesis is that recall is determined by the strength of the memory trace. Recall begins with 
items with strong traces and proceeds to those with weaker traces. An alternative view, which will 
be argued below, is that the order of recall depends upon the depth of encoding which has taken 
place and the strategies for recall adopted by the subject. 

What grounds are there for accepting the trace strength hypothesis? It is consistent with 
the ‘spew’ hypothesis of Underwood & Schulz (1960) which assumes that the likelihood of a 
word being thrown up from memory is directly related to the past experience of that word. 
Experimental evidence to support the trace strength hypothesis comes from Rundus (1974) 
who examined the relationship between overt rehearsal and output order. Rundus had previously 
established that probability of recall (and therefore, presumably, memory strength) is directly 
related to the number of overt rehearsals of the item during learning (Rundus, 1971). Rundus 
now demonstrated that items output early tend to be those most frequently rehearsed and he 
concluded that the items with the stronger memory traces are therefore the first recalled. Further 
support for the strength hypothesis can be found in the results of Murdock & Okada (1970) who 
showed that the interresponse times between the recall of items from a list of unrelated words 
are initially very short, but increase steadily and are directly related to the position in output and 
the number of words which the subject will eventually recall. Subjects appear to begin recall 
with words that are easy to retrieve and proceed to those that require more search time. 

One objection to the trace strength hypothesis is that it is well known that subjects tend to 
recall together items from the same semantic category. Such groups appear to form higher 
memory units (Tulving & Pearlstone, 1966; Tulving & Psotka, 1971; Kellas, Ashcraft, Johnson & 
Needham, 1973). However, it would be quite reasonable to expect the trace strength hypothesis 
to hold both for the recall of the higher order units, and, independently, for the items within the 
higher units. 

Practiced subjects tend to begin recall with the last few words from the list. These items, 
commonly believed to be retrieved from a short-term memory, might be regarded as the 
strongest items in memory at the time of recall, but in terms of long-term strength they are not 
strongly registered (Craik, 1970; Madigan & McCabe, 1971). A viable strength hypothesis must 
make an exception of such short-term effects, but for long-term memory the hypothesis would 
predict that order of output reflects trace strength. The trace strength hypothesis predicts an 
inverted U function for the probability of later recall plotted in terms of the position in the initial 
output. The first items, coming from short-term memory, should have poor long-term strength. 
The subsequent items, being the first recalled from long-term memory, should have the strongest 
traces and be those most likely to be retained. The final items should be those with weak 
memory traces and they should often be lost during the retention interval. 

Craik (1970) reported data on the probability of recall at a later test that is quite incompatible 
with the strength hypothesis. He found that the probability of later recall steadily increased 
across the initial output positions. It was about 0.15 for the first item recalled initially and rose to 
between 0.6 and 1, depending on presentation mode, for the final items from the 15-item list. A 
problem in interpreting Craik’s data is that because subjects differ in the number of items in their 
initial recall those with high recall scores contribute a disproportionately large amount to the last 
few output positions. Vincentized data has been reported by Darley & Murdock (1971) which 
shows a steady increase in the probability of late recall until the penultimate fifth of the initial 
output. The probability of late recall of the last fifth is very slightly lower but the difference was 
not tested by Darley & Murdock and is probably unreliable. However, Rundus, Loftus & 
Atkinson (1970) did find an inverted U in Vincentized data for late recognition. 
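
Vincentizing, as referred to above, removes the bias from unequal protocol lengths by expressing each output position as a proportion of that subject's total output. The following is a minimal sketch only: it assumes fifths (`n_bins=5`), pools items across subjects within each fifth, and uses hypothetical names (`vincentize`, `protocols`, `later_recalled`) that do not come from the cited papers.

```python
from collections import defaultdict

def vincentize(protocols, later_recalled, n_bins=5):
    """Probability of later recall for each proportional slice of initial output.

    protocols:      list of lists; each inner list is one subject's initial
                    recall, in the order the words were written down.
    later_recalled: list of sets; the words each subject produced at the late test.
    Items are pooled across subjects within each slice (a simplification).
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for output, later in zip(protocols, later_recalled):
        n = len(output)
        for i, word in enumerate(output):
            b = i * n_bins // n          # slice index, 0 .. n_bins - 1
            totals[b] += 1
            hits[b] += word in later     # True counts as 1
    return [hits[b] / totals[b] if totals[b] else None for b in range(n_bins)]
```

With this scheme a subject who recalls five words and one who recalls ten both contribute to every fifth, so high scorers no longer dominate the final positions.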

Craik’s finding has, therefore, yet to be demonstrated to be reliable. However, if it is a real 
feature of recall then not only is it strong evidence against the trace strength hypothesis but it 
stands as a paradoxical result clearly needing explanation. Why are words which are initially 
difficult to recall so much more permanently registered? The explanation suggested here is that 
the Craik effect is the result of different encoding of the items recalled at the beginning and ends 
of the output sequence, combined with the recall strategies adopted by the subjects. 

Evidence is needed to support the hypothesis just given. It needs to be shown that the first 
items that are recalled on the initial test are more superficially encoded than the later items. It 
also follows from the hypothesis that if superficial encoding is eliminated and semantic encoding 
is encouraged then the Craik effect should disappear. The first experiment reported below sought 
evidence on the nature of coding. The second experiment involved the encouragement of 
semantic and the elimination of superficial encoding. 

In the first experiment the probability of late recall as a function of initial output order was 
calculated to check that the Craik effect was reliable. The position of the first item recalled was 
examined. If type of coding and subject strategies determine recall then we might expect to see 
many subjects trying to begin recall at the beginning of the list. Over trials they may learn to 
switch to first recalling the last few items presented, since they are easily available. Such a 
switch should be accompanied by the change from primacy to recency in the serial position 
curves, reported informally by Murdock (1974). 

Some evidence on the depths of processing adopted may be found in the extent to which items 
are recalled in chunks adopted from the list. Since the adjacent items are unrelated, recall on the 
basis of semantic information should bear little correspondence to the original list order. 
However, items which have been temporarily stored, perhaps being retained by rehearsal, may 
show a greater tendency to be ‘poured’ out in groups which follow the original list order. If the 
first items are superficially encoded while the later items have been semantically processed then 
such recall of chunks of the original list should be more common at the beginning than at the end 
of the recall sequence. Therefore, for items in each recall position the probability that the next 
item recalled had been its succeeding item in the presentation list was calculated. 
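
The measure just described can be sketched in a few lines. This is an illustrative reading of the procedure, not the author's own scoring method: the function name and arguments are hypothetical, one presentation list is handled at a time, and any word not on the list (an intrusion) is assumed to count as an opportunity but never as a success.

```python
def succession_probability(presented, protocols):
    """For each output position, the probability that the next word recalled
    was the word that followed it in the presentation list.

    presented: the list of words in presentation order.
    protocols: list of lists; each inner list is one subject's recall order.
    Returns a list indexed by output position (0 = first word recalled).
    """
    index = {word: i for i, word in enumerate(presented)}
    hits, totals = [], []
    for output in protocols:
        for pos in range(len(output) - 1):
            if pos == len(totals):       # first time this position is seen
                hits.append(0)
                totals.append(0)
            current, following = output[pos], output[pos + 1]
            totals[pos] += 1
            if (current in index and following in index
                    and index[following] == index[current] + 1):
                hits[pos] += 1
    return [h / t for h, t in zip(hits, totals)]
```

For example, with `presented = ["a", "b", "c", "d", "e"]` a protocol beginning `["d", "e", ...]` scores a success at the first output position.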


Experiment I 

Method 

Subjects. One hundred and fifty-five undergraduates taking the Part I course in psychology at Lancaster 
University acted as subjects during a lecture period. Three subjects failed to hand in all their response 
sheets, so the data reported below are from 152 subjects. 


Materials. Three lists of 15 nouns were selected from the norms of Paivio, Yuille & Madigan (1968) 
controlling for imagery (I), and meaningfulness (m). For the three lists the mean I values were 4.41, 4.45 and 
4.59 respectively, and the mean m values were 6.07, 6.03 and 6.17. The range of I values for the full set was 
4.03-4.97 and for m was 5.50-7.00. Since I and m have been found to be the most powerful predictors of 
differences between words in free recall tasks (Paivio, 1971) it was hoped that control of these dimensions 
would reduce variability in the recall. 

The lists were typed one word to a line with triple-line spacing. They were made up into booklets with one 
list to a page, blank pages between each list and an additional blank page at the end of the booklet. Equal 
numbers of booklets were prepared with the word lists in each of the six possible orders. The six sets of 
booklets were then collated so that when distributed to the subjects equal numbers of each order would be 
used. 


Procedure. Each subject was given a booklet and was told that when instructed they were to turn over the 
first page and read through the items in the list one at a time. A bleep from an amplification system every 
2 sec would be the signal to move on to the next item. As soon as they had finished the list and the bleeps 
ceased, they were to turn the list over and write out as many of the words as they could remember, in any 
order they liked. After 90 sec they would be told to stop and the next list would be run. This procedure was 
followed for the three lists. Then the subjects were asked to write out as many words as they could 
remember from their first two lists. No prior warning of this late test was given. 


Results 


The mean recall for the first, second and third lists learned was 7.1, 6.9 and 7.1 respectively. The 
mean number of words recalled in the late test was 4.8. 


Late recall. To test the Craik effect, the initial recall protocols of each subject for lists 1 and 2 
were divided into three sections, consisting of the first two items recalled, the last two items, 
and the middle items. Because a high level of performance was expected the third list was not 
tested. For the sections the probabilities of the items being recalled again at the late test were 
calculated. The means of these probabilities for the three sections were 0.23 for the first two 
items, 0.35 for the middle items and 0.40 for the final two items. A Friedman two-way analysis 
of variance by ranks conducted on the probabilities gave χ² = 33.8, d.f. = 2, P < 0.001. Nemenyi 
tests (Kirk, 1968) between the conditions indicated that the probability of later recall of the first 
two items was significantly lower than the probability of late recall of either the middle or the 
last two items (P < 0.001 in both cases). The likelihood of late recall of the middle and last two 
items did not differ significantly. 
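
The sectioning of each protocol can be sketched as follows. This is an illustration under stated assumptions rather than the scoring actually used: the function names are hypothetical, and protocols with fewer than five words are skipped so that the three sections never overlap, a cut-off the text does not specify.

```python
def section_probabilities(output, later):
    """Probability of later recall for the first two, middle and last two
    items of one initial recall protocol.

    output: words in the order they were first recalled.
    later:  set of words the same subject produced at the late test.
    Returns None for protocols too short to split (assumed cut-off: 5 words).
    """
    if len(output) < 5:
        return None
    sections = (output[:2], output[2:-2], output[-2:])
    return tuple(sum(word in later for word in sec) / len(sec)
                 for sec in sections)

def mean_section_probabilities(protocols, laters):
    """Average the three section probabilities over all usable protocols."""
    rows = [p for p in (section_probabilities(o, l)
                        for o, l in zip(protocols, laters)) if p]
    return tuple(sum(col) / len(rows) for col in zip(*rows))
```

Each usable protocol yields three probabilities, and it is these per-subject triples that would be ranked in a Friedman analysis.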

It might be expected that the poor recall of the first two items at the late test resulted from the 
poor recall of those items from the end of the presentation list which are commonly believed to 
be recalled from an ‘echo box’ with very superficial encoding of the items. The probability of 
late recall of the first item recalled at the immediate test was therefore calculated separately for 
subjects who began immediate recall with the first word from the presentation list, and for those 
who began recall with one of the last six items from the presentation list. These probabilities 
were 0.14 and 0.17 respectively. Clearly, the first word recalled by a subject has a low chance of 
being recalled at the late test, whether recall begins at the beginning or the end of the list. 


First word recalled. The percentage of subjects beginning initial recall with each item in their 
three lists is given in Table 1. It is clear that when recalling their first list many subjects start 
their recall at the beginning of the list. Only a tiny fraction of subjects begin with a word from 
positions 2-9. The final items are the starting point for some subjects, with the final item the most 
popular. By the third list there is a shift in the recall pattern. The first item in the list is still the 
mode for the start of recall, but a majority of subjects are now beginning their recall from the 
last five or six items. Items 11 and 12 are the most popular, as subjects run off the last few items 
before attempting to recall words from earlier in the list. That subjects tend to shift from 
beginning recall in the first half of the list to beginning in the second half between trials 1 and 3 
was confirmed by McNemar test (Siegel, 1956): χ² = 12.75, d.f. = 1, P < 0.001. 


Table 1. Percentage of subjects beginning recall with an item from the given serial positions 

Serial position 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 
List 1 47 1 1 0 1 1 1 2 3 4 11 4 7 15 
List 2 35 3 1 1 1 2 3 9 15 7 7 15 
List 3 29 1 1 2 1 1 1 5 16 18 10 3 10 











Table 2. Probabilities of recall for the 15 serial positions 

Serial position 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 
List 1 71 61 44 34 37 36 49 39 38 32 45 55 47 9 6 
List 2 64 51 40 40 36 35 4 #27 34 21 45 Q 52 Q 75 
List 3 61 51 37 232 33 41 43 35 29 31 48 65 65 601 79 
Final recall 15 19 It 15 2 16 23 17 12 #10 1 21 16 6 18 





The shift in the position of the first word recalled is accompanied by the shift from primacy to 
recency in the serial position curves reported by Murdock (1974). The probability of recall for 
each serial position was calculated for the first, second and third lists learned, and for the final 
recall. The probabilities of recall for the 15 serial positions are shown in Table 2. For each 
successive list learned the primacy effect declines and the recency effect increases. The shift was 
confirmed by binomial tests. Testing for the decline in primacy, 70 subjects recalled more from 
the first five items of list 1 than of list 3, while 40 recalled more from the first five items of list 3 than 
list 1 (binomial test; z = 2.77, P < 0.005). For the increase in recency, 72 subjects recalled more 
from the last five items of list 3 than list 1, while 40 recalled more from list 1 than list 3 
(binomial test; z = 2.93, P < 0.005). 
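
The two z values can be reproduced from the reported counts with a normal approximation to the binomial (sign) test. The sketch below assumes a continuity correction of 0.5 and that subjects who recalled equally from the two lists were dropped. The text states neither detail, and the function name is illustrative.

```python
from math import sqrt

def sign_test_z(n_more, n_fewer):
    """Normal approximation to the two-category binomial test, with a 0.5
    continuity correction (an assumption; the paper does not give the formula).

    n_more, n_fewer: numbers of subjects showing a difference in each direction.
    Under the null hypothesis each direction is equally likely (p = 0.5).
    """
    n = n_more + n_fewer
    expected = n / 2
    return (abs(n_more - expected) - 0.5) / sqrt(n / 4)

print(round(sign_test_z(70, 40), 2))  # 2.77 (decline in primacy)
print(round(sign_test_z(72, 40), 2))  # 2.93 (increase in recency)
```

Without the continuity correction the first value would be about 2.86, so the agreement with the reported figures suggests that the correction was applied.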


Recall order. For the first word recalled by subjects, the probability that the next word recalled 
was the succeeding word in the presentation list was calculated, and so on for the second and 
subsequent words recalled. The result is shown in Fig. 1. With the exception of the sixth recall 
position, all the points are based on a hundred or more observations. The points for the sixth 
position are based on 77+ observations. The graph shows a high probability that initially the 
subject will adopt chunks from the presentation list, with the influence of the original list 
declining steadily as the subject’s recall continues. 

It might be thought that the main contribution to the tendency to adopt chunks came from the 
last few items in the list where subjects may be running off the ‘echo box’ before beginning 
recall of earlier items. The probability of the second word recalled being the succeeding word 
from the presentation list was therefore calculated separately for those subjects who began their 
recall with the first word from the presentation list, and those who began recall with an item in 
the second half of the presentation list. For those beginning recall with the first item, the 
probabilities were 0.60, 0.65, and 0.67 for the three successive lists, while for subjects beginning 
recall in the second half of the list the probabilities were 0.72, 0.74, and 0.83. While these figures 
suggest a greater tendency to adopt chunks when recall begins at the end of the presentation list, 
beginning recall with a piece of the original list seems common wherever recall commences. 

Examination of the figure suggests that the extent to which chunks are adopted from the 
presentation list increases with trials. This was confirmed by calculating the number of times 
succeeding pairs of words were recalled in order in the first four output positions for each 
subject on lists 1 and 3. For 74 subjects more recall sequences occurred in list 3 than list 1, for 
28 subjects more sequences occurred in list 1 than list 3 (binomial test; z = 4.46, P < 0.001). 


Figure 1. The probability that the next word recalled will be the word which followed in the presentation list, 
plotted for each position in the recall output. (Separate curves for Lists 1, 2 and 3; ordinate: probability; abscissa: recall position.) 


Discussion 


The first few words recalled, on the initial test, whether from the beginning or the end of the 
presentation list, are poorly recalled at the later test. The last few words recalled on the initial 
test do not show the loss at the later test that would be expected on the trace strength 
hypothesis. The results, therefore, neither support the strength hypothesis nor consistently 
match those reported by Craik (1970). 

While the strength hypothesis does not predict the results, there is evidence for the alternative 
hypothesis that the control processes adopted by subjects influence the output order. So many 
subjects begin their recall with the first item from the list, and so few with items 2-9 that it is 
hard to believe an explanation other than that subjects try to begin at the beginning, at least on 
their first list. The shift in the position of the first item recalled is also consistent with a strategy 
view. Subjects shift to a strategy of beginning recall from the end of the presentation list, 
perhaps to avoid the rehearsal necessary to maintain the early items in memory. However, they 
do not begin with the last word from the list, as the strength hypothesis might predict, but rather 
output in order the last few items starting with the 11th or 12th item. 

The first few items tend to be recalled in groups from the original list. It is as if subjects have 
temporarily acquired, and retained at a superficial level of encoding, a part of the list which they 
simply ‘pour out’. 

The items recalled early during the initial recall seem to be retained at a superficial level of 
encoding. The later items are not recalled in groups from the presentation list. This is as would 
be expected if the later items retrieved are coded semantically and retrieved through semantic 
cues. 

The first item recalled is often the first item from the presentation list. Its very low probability 
of later recall suggests that it has been only superficially encoded. It has been retained until the 
end of the list, and, as Rundus (1971) has shown, the early items in the presentation list are 
those most frequently rehearsed. It is therefore likely that the retention of the early items from 
the presentation list at a superficial level of coding is achieved through frequent rehearsal. The 
poor long-term retention of these items suggests that the rehearsal has not led to more permanent 
storage. This adds a little more support to the growing body of evidence, reviewed by 
Postman (1975), that rehearsal can simply be used to maintain items in memory with no 
improvement in long-term retention. 

When the distinction between maintenance rehearsal and rehearsal that is constructive in 
forming a long-term trace is recognized, the apparent discrepancy between the present results and 
those of Rundus (1974) becomes explicable. Rundus did not distinguish between these two sorts 
of rehearsal. It would be very difficult to do so for overt rehearsal. Perhaps the pauses between 
items might be used as an indicator, since items undergoing maintenance rehearsal can be 
quickly run through while constructive rehearsal will require pauses between items while 
processing and analysis takes place. If items that have been retained using maintenance rehearsal 
are output first, and since, presumably, items so retained require several rehearsals during the 
list learning, then the earlier items recalled should be associated with several overt rehearsals. 
Items recalled towards the end of the recall come, it has been claimed above, from semantically 
based storage, and since they have not been output early in recall, will tend to be items that 
have not been the subject of maintenance rehearsal. They will have received constructive 
rehearsal, but that does not involve frequent repetition. In fact such repetition would be 
disadvantageous since it would occupy processing capacity required for semantic encoding. The 
later output items should tend, therefore, to be associated with fewer overt rehearsals than the 
earlier output items, as Rundus found. The relationship between output order and number of 
rehearsals is a function of the encoding strategies adopted, and not the strength of trace formed 
by the rehearsals. It is possible that the relationship between number of rehearsals and 
probability of recall is less dramatic if the immediate test of recall is not involved, and the 
influence of maintenance rehearsal is removed. 


Experiment II 


It was suggested above that the Craik effect occurs because the first few items that are recalled 
are retrieved from a superficial level of coding while later recall is based upon semantic 
encoding. From this it follows that if superficial encoding can be discouraged and its effects 
eliminated then the Craik effect should disappear. 

In the second experiment two conditions were used to encourage semantic encoding. On the 
first, third and fifth lists the subjects were required to classify each word as representing a living 
or a non-living thing. The second and fourth lists were composed of items drawn from four 
categories and it was hoped that this would lead to deeper processing. The use of superficial 
encoding was discouraged by requiring subjects to remember and write down two eight-digit 
numbers before beginning to recall the lists. 

It has been predicted that the items output early on the initial recall test should no longer show 
poorer long-term retention than the other items. Can further predictions be made about the 
relationship to be expected between initial output order and probability of late recall? The 
strength hypothesis can be tested again. Now, with superficial encoding removed, the strength 
hypothesis predicts that the probability of long-term retention will decline over initial output 
positions. Another alternative hypothesis is that the Craik effect occurs not because the early 
items are superficially encoded but because the last few items initially recalled are better learned 
as a result of the subject's activities during the first attempt at recall. The effort involved in 
recalling the last few items may tag them emotionally, or lead to the retention of the appropriate 
recall cue. The subject may simply spend more time during the recall phase looking at the last 
few items and repeating them as recall cues as he struggles to retrieve more from the list. On 
this view, the elimination of superficial encoding should not eliminate the Craik effect. The 
strength hypothesis now predicts that the Craik effect will be reversed. The learning at initial 
recall hypothesis predicts that the Craik effect will still occur. The depth of coding hypothesis 
suggests that there should be no relationship between initial output order and later recall. In 
addition to examining the probability of late recall, the serial position of the first item recalled 
and the probability of the second item retrieved being the succeeding item from the presentation 
list were examined. The pattern of these should show that recall was more dependent upon 
semantic than upon superficial encoding. 


Method 


Subjects. Twenty unpaid volunteers acted as subjects, the majority being undergraduates taking the Part II 
Psychology course at Lancaster University. One subject recalled too few items for analysis and was 
replaced. 


Materials. For the classification task, three lists of 15 items were composed by selecting the first item from 
the Battig & Montague (1969) category norms for 45 categories. Within each list either seven or eight of the 
items were classifiable as living things or parts of living things while the remaining words refer to inanimate 
objects. For the categorized lists two lists of 16 items were composed by selecting the fifth to eighth items 
from the Battig & Montague category norms for vegetables, sports, clothing, and alcoholic drinks, to form 
one list, and similarly from the animals, fruits, parts of the human body and metals categories to form the 
other. For each list, the order of the words was randomized and they were typed one word to a line with 
triple-line spacing. They were made up into sets, with the classification lists as the first, third and fifth list 
and the categorized lists as the second and fourth lists. The actual lists in these positions were randomized 
for the different subjects. 


Procedure. Subjects were tested either individually or in small groups. Each subject received a set of lists 
and five small and one large sheet of paper. They listened to a bleep from a tape-recorder which sounded 
every three seconds, and were told that when the bleeps were started again they were to turn over the first 
list of words, and go through it in time with the bleeps putting a tick against each item that referred to a 
living thing, or part of a living thing, and a cross against items which represented a non-living thing. When 
they had completed the list they were to turn it over. An eight-digit number would then be read which they 
must listen to and write down. A further eight-digit number would then be read. After they had written it 
down they were to write out as many of the words as they could remember, in any order they liked. The same 
procedure would be followed for the other four lists, except that no ticking and crossing would be required 
for the second and fourth lists. At the end, they would be asked to recall as many words as they could 
remember from all five lists. No time limit was put upon the recall phase, but subjects proceeded to the next 
list when it was clear that they could remember no more. 


Results 


Late recall. The results for the classification task and the categorized lists were analysed 
separately. Because a high level of performance was expected, the last list was not scored. The 
initial recall protocols for the first four lists were divided into three sections, consisting of 
the first two items recalled, the last two items and the middle items. For these sections the 
probabilities of the items being recalled again at the late test were calculated. For the 
classification task the mean probabilities for the beginning, middle and end sections were 0.53, 
0.50 and 0.61 respectively. For the categorized lists the mean probabilities were 0.72, 0.73 
and 0.67. Differences between the sections, tested by Friedman two-way analyses of variance 
by ranks, were not significant. For the classification task χ² = 3.22, d.f. = 2, P = 0.2. For the 
categorized lists χ² = 1.30, d.f. = 2, P > 0.5. Such difference as there is for the classification 
task comes from slightly better recall of the last two items, but too much should not be made 
of this since the trend is reversed for the categorized lists. 
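
The probabilities quoted for the two Friedman statistics can be checked by hand. The sketch below assumes that each statistic was referred to the chi-square distribution with 2 d.f. (the usual large-sample approximation; exact tables may have been used instead), in which case the upper-tail probability has the closed form exp(-x/2). The function name is illustrative only.

```python
import math

def chi2_upper_tail_df2(x):
    """Upper-tail probability of a chi-square variate with 2 d.f.

    For 2 d.f. the survival function is exactly exp(-x / 2).
    """
    return math.exp(-x / 2.0)

# Friedman statistics reported in the text for the three recall sections.
for label, stat in [("classification task", 3.22), ("categorized lists", 1.30)]:
    p = chi2_upper_tail_df2(stat)
    print(f"{label}: chi-square = {stat:.2f}, d.f. = 2, P = {p:.3f}")
```

Under that assumption the values come out at about 0.20 and 0.52, in line with the reported P = 0.2 and P > 0.5.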

Table 3. Number of occasions recall began with an item in the given serial position

Serial position      1     2     3     4     5     6     7     8     9    10    11    12    13    14    15    16
Classifying task     7     6     4     7     0     2     7     3     0     5     4     3     7     2     3
                  (12)  (10)   (7)  (12)   (0)   (3)  (12)   (5)   (0)   (8)   (7)   (5)  (12)   (3)   (5)
Categorized list     7     8     4     3     4     4     1     1     1     0     2     2     1     1     1     0
                  (17)  (20)  (10)   (8)  (10)  (10)   (3)   (3)   (3)   (0)   (5)   (5)   (3)   (3)   (3)   (0)

Note. Numbers in parentheses indicate the percentage of occasions recall began with that serial position.


First word recalled. The distributions of the first word recalled across the serial positions of the 
presentation lists are shown in Table 3. Percentages are given in parentheses. Examination of the 
table suggests that the distributions differ from one another, and from the distributions found for 
lists 1 and 3 in Expt. I. Tests on the distributions were carried out using the Kolmogorov-Smirnov 
two-sample test (Siegel, 1956). It should, however, be noted that the tests involve two 
assumptions which could be questioned. These are (a) that in Expt. II there is no interaction 
between lists and subjects, so that data from all lists and subjects can be pooled and (b) that 
the last item in the categorized list, which was never the first word recalled, can be ignored for 
the purpose of the tests so that the list lengths are equated. Given these assumptions, the tests 
indicate that the distributions are significantly different from each other and from those in 
Expt. I (P < 0.025, two-tailed tests). 
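
The comparison between the two Expt. II distributions can be sketched as follows. This is an illustration under stated assumptions, not the computation actually performed: the counts are copied from Table 3, data are pooled over lists and subjects as in assumption (a), the 16th position of the categorized lists is dropped as in assumption (b), and significance is judged against the standard large-sample critical value of D rather than against printed tables. All function names are illustrative.

```python
import math

# First-recall counts by serial position, copied from Table 3.
classifying = [7, 6, 4, 7, 0, 2, 7, 3, 0, 5, 4, 3, 7, 2, 3]   # positions 1-15
categorized = [7, 8, 4, 3, 4, 4, 1, 1, 1, 0, 2, 2, 1, 1, 1]   # position 16 (count 0) dropped

def cumulative_proportions(counts):
    """Cumulative relative frequencies across serial positions."""
    total = sum(counts)
    running, out = 0, []
    for c in counts:
        running += c
        out.append(running / total)
    return out

def ks_two_sample_d(a, b):
    """Largest absolute difference between the two cumulative distributions."""
    return max(abs(x - y) for x, y in
               zip(cumulative_proportions(a), cumulative_proportions(b)))

def ks_critical(alpha, n1, n2):
    """Large-sample two-tailed critical value of D at level alpha."""
    c = math.sqrt(-0.5 * math.log(alpha / 2.0))
    return c * math.sqrt((n1 + n2) / (n1 * n2))

d = ks_two_sample_d(classifying, categorized)
n1, n2 = sum(classifying), sum(categorized)
print(f"D = {d:.3f} (n1 = {n1}, n2 = {n2})")
for alpha in (0.05, 0.025, 0.01):
    crit = ks_critical(alpha, n1, n2)
    print(f"alpha = {alpha}: critical D = {crit:.3f}, exceeded = {d > crit}")
```

With these counts D is about 0.32, which exceeds the large-sample critical value at the 0.025 level (about 0.30) but not at the 0.01 level (about 0.33), consistent with the reported P < 0.025. Because serial position is a grouped, discrete variable, the test is conservative here.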


Recall order. For the classification task the probability that the second word recalled was the 
next word in the presentation list was 0.25, which, while greater than chance, is far smaller than 
that found in Expt. I. For the categorized lists the probability was 0.1. The recall of the 
categorized lists clearly reflected the influence of semantic factors. In 28 of the 40 cases the 
second word recalled came from the same category as the first. In 22 of these cases the recalled 
word was from later in the presentation list. Adjusted ratios of clustering (ARC; Roenker, 
Thompson & Brown, 1971) were calculated for each subject. The ARC measure has a value of 1 
if all the words from each category that are recalled are recalled together, and a value of 0 for 
chance levels of clustering. The mean ARC values for lists 2 and 4 were 0.65 and 0.70 
respectively. On list 4, 13 subjects had an ARC value of 1. 
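
For readers unfamiliar with the measure, a minimal sketch of the ARC computation is given below. It follows the usual definition, ARC = (R - E(R)) / (maxR - E(R)), where R is the number of adjacent same-category pairs in the recall protocol, E(R) is the number expected by chance and maxR is the largest number possible. The function name and the example protocols are hypothetical and are not data from this experiment.

```python
from collections import Counter

def adjusted_ratio_of_clustering(recalled_categories):
    """ARC for one recall protocol, given the category of each recalled word in output order.

    Returns None when the score is undefined (empty protocol, or chance and
    maximum clustering coincide, e.g. only one category recalled).
    """
    n = len(recalled_categories)
    if n == 0:
        return None
    per_category = Counter(recalled_categories)
    k = len(per_category)                      # categories represented in recall
    observed = sum(1 for a, b in zip(recalled_categories, recalled_categories[1:])
                   if a == b)                  # adjacent same-category pairs
    expected = sum(c * c for c in per_category.values()) / n - 1
    maximum = n - k
    if maximum == expected:
        return None
    return (observed - expected) / (maximum - expected)

# Hypothetical protocols with two categories, "a" and "b".
print(adjusted_ratio_of_clustering(list("aaabbb")))   # perfectly clustered
print(adjusted_ratio_of_clustering(list("aabbab")))   # chance-level clustering
print(adjusted_ratio_of_clustering(list("ababab")))   # strict alternation
```

The three hypothetical protocols give 1.0, 0.0 and -1.0 respectively, which shows that the score is anchored at 1 for perfect clustering and 0 for chance, and can go negative.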

Since recall is organized to such an extent on the basis of the semantic properties of the items 
it is clear that many items had received semantic processing. 


Discussion 


Several features of the results indicate that superficial encoding was successfully either 
prevented or cleared by the procedures adopted. The strong tendency found in Expt. I to begin 
recall with several items in the order of the presentation list was not repeated, and the pattern of 
the position of the first item recalled was different. In Expt. I recall usually began with the first 
or 11th or 12th items and it was suggested that chunks of the list had been maintained by 
rehearsal and were output first. Now, with rehearsal effects eliminated, the position of the first 
item recalled varies greatly. For the categorized lists it may be that subjects attempt to begin 
recall near the beginning of the list, and then output items from the same category as the first 
recalled item. The probabilities of late recall, given initial recall, were much higher than in the 
first experiment. Of course, a proper comparison cannot be made since, while in Expt. II more 
lists intervened between initial and late tests, subjects also had longer to learn the items. Even 
so, the high level of late recall is in line with the findings of Mazuryk & Lockhart (1974) that 
semantic encoding leads to fewer items being recalled initially, but to better long-term retention. 
The ARC data show that the recall of the categorized lists closely reflected those categories. 

In the present experiment, with superficial encoding eliminated, the Craik effect does not 
occur. In fact there is little sign of output order being related to likelihood of late recall. Neither 
the strength hypothesis nor the learning-at-initial-recall hypothesis is supported. 

However, the hypothesis that the Craik effect occurs in conventional free recall because items 
that are superficially encoded are output first is upheld by the results. It now appears that the 
Craik effect is not as paradoxical as it appeared. The steady improvement in the likelihood of 
late recall over initial output positions found by Craik (1970) has not been replicated when 
account has been taken of the amount which each subject recalls. It remains true that the first 
items output initially in a conventional free recall task have a lower probability of late recall than 
do the succeeding items. However, this appears to result from the use of superficial encoding for 
these items. 


Some remarks on the relationship between initial output order 
and subsequent performance in free recall 


John M. Gardiner and Peter Herriot 


Craik (1970) found that when after a series of immediate free recall trials subjects were asked to 
recall any items they could from all previous lists, the probability of recalling an item in this final 
test increased dramatically the later it had been output in the initial recall trial. Poor subsequent 
recall of items that had been among the first to be recalled on any of the initial tests would be 
expected either on the grounds that such items had been retrieved largely from primary memory 
(Waugh & Norman, 1965) or because they had been encoded at only a shallow level of 
processing (Craik & Lockhart, 1972). More surprising was the finding that the function relating 
initial output order to subsequent recall performance continued to rise through the later output 
positions. Items recalled towards the end of any one test trial may be thought of as having been 
retrieved from secondary memory or as having been encoded to a deeper level of processing. It 
has been suggested (Rundus, 1971) that for such items recall order proceeds from stronger to 
weaker item traces, a position which is reminiscent of the spew hypothesis (Underwood & 
Schulz, 1960). If this notion is correct then the function relating initial output order to 
subsequent recall should decline over the later output positions. The main purpose of this note is 
to comment on some methodological problems involved in determining this function, particularly 
in the light of the results Morris reports elsewhere in this Journal. 

First, however, we describe further data on this output order effect which, though not 
reported at the time, were collected during an earlier study (Gardiner, Thompson & Maskarinec, 
1974; Experiment I). In that study 25 subjects each received a series of ten different 12-word 
lists for free recall following a period of 30 sec rehearsal-preventing activity. In this paradigm 
there is good evidence that all items may be retrieved from secondary memory or encoded at a 
deeper level (e.g. Glanzer & Cunitz, 1966). After the last recall test, the subjects were then 
asked to recall any items they could from all previous lists. In the present context the result of 
interest concerns the function relating initial output position to recall probability in the final test. 
Because not all subjects contribute equally to the later output positions, this function was 
calculated using Vincentization procedures (Hilgard, 1938). Each subject's output protocol, for 
each list, was divided approximately into quartiles and final recall probabilities, conditionalized 
on each quartile, were averaged over all lists and subjects. From the first to the fourth quartile 
respectively these probabilities were 0.56, 0.58, 0.53 and 0.54. Each of these values is based on 
between 375 and 383 observations. Clearly the function obtained here is essentially flat. This 
outcome offers little support either to the original (non-Vincentized) effect reported by Craik 
(1970) or to the notion that order of recall in this situation proceeds from stronger to weaker item 
traces (Rundus, 1971). 
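
The Vincentizing step described above can be sketched as follows. This is a minimal illustration, not the scoring procedure actually used: the rule for assigning output positions to approximate quartiles, the pooling of observations across protocols (rather than averaging per subject and list), the function names and the example protocols are all assumptions made for the sketch.

```python
def quartile_index(position, protocol_length, n_bins=4):
    """Approximate quartile (0-based) of a 0-based output position."""
    return position * n_bins // protocol_length

def vincentized_final_recall(protocols, n_bins=4):
    """Probability of final recall conditional on initial-output quartile.

    Each protocol is one initial recall trial: a list of booleans in output
    order, True if that item was also recalled in the final test.
    """
    recalled = [0] * n_bins
    observed = [0] * n_bins
    for protocol in protocols:
        for position, in_final_test in enumerate(protocol):
            q = quartile_index(position, len(protocol), n_bins)
            observed[q] += 1
            recalled[q] += int(in_final_test)
    return [r / n if n else None for r, n in zip(recalled, observed)]

# Two hypothetical protocols of different lengths (4 and 8 items recalled).
protocols = [
    [True, False, True, True],
    [True, True, False, False, True, False, True, True],
]
print(vincentized_final_recall(protocols))
```

Because every protocol contributes to every quartile, however many items were recalled, no quartile is based on a selected subset of subjects. Protocols with fewer than four items leave some quartiles empty, which is one reason the division can only be approximate.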

The function reported by Morris seems very similar to that observed here, and both these 
results contrast with those obtained by Darley & Murdock (1971) and by Rundus, Loftus & 
Atkinson (1970). In those earlier studies, Vincentization procedures had been used to examine 
the output order function following immediate free recall. Over the later output positions, Darley 
& Murdock found that the function increased, whereas Rundus et al. found that it declined: in 
neither study did the trend appear to be particularly dramatic. 

The most likely explanation of the discrepant pattern of results over these several studies may 
lie in certain problems associated with the procedures used to calculate the output order 
function. Vincentization corrects for a possible subject-selection artifact in that not all subjects 
contribute data to the later output positions. The data from Vincentized functions indicate that 
the original effect reported by Craik was partly attributable to that artifact. Vincentization, 
however, introduces an item-selection problem. An illustration may make this clear. Dividing 
output protocols into quartiles can lead to equating the fourth item in a four-item protocol with 
the seventh and eighth in an eight-item one. From the point of view of the notion that recall 
order proceeds from stronger to weaker item traces, the implication of this procedure is that such 
items are more or less equivalent in their trace strength. To the extent that this assumption is 
invalid, then over later output positions Vincentization involves collapsing weaker and stronger 
items into the same category. The procedure used by Morris, whereby the output order function 
is calculated on the basis of the first two, middle two, and last two items to be recalled, does not 
avoid this problem and in that respect his procedure is similar to Vincentization. 
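
The item-selection problem can be made concrete with a short sketch. The quartile rule and both function names are illustrative assumptions; the second function mirrors the division into first two, middle, and last two items described for Morris's procedure.

```python
def fourth_quartile_positions(n_items):
    """1-based output positions that fall in the last of four approximate quartiles."""
    return [p for p in range(1, n_items + 1) if (p - 1) * 4 // n_items == 3]

def first_middle_last_sections(n_items):
    """Split 1-based output positions into first two, middle, and last two items."""
    positions = list(range(1, n_items + 1))
    return {
        "first two": positions[:2],
        "middle": positions[2:-2],
        "last two": positions[-2:],
    }

for n in (4, 8):
    print(n, "items: fourth quartile =", fourth_quartile_positions(n),
          "| last two =", first_middle_last_sections(n)["last two"])
```

Under either scheme the final section of a four-item protocol is scored together with the seventh and eighth items of an eight-item one, which is the equivalence questioned here.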

That point aside, interpretations of the output order function which are couched solely, or even 
largely, in terms of registration factors such as trace strength or encoding depth, seem likely to 
prove inadequate. This is because such explanations do not take sufficient account of the 
possibility that the initial act of recall may alter the state of the memory trace. As a consequence 
of recall, for example, information derived from response-produced feedback becomes available 
to the subject. Some unpublished data collected recently in our laboratory by Cathy Passmore 
provide evidence that such information may affect subsequent performance. In a test of memory 
for remembered events (Gardiner & Klee, 1976), her subjects had poorer knowledge of their 
previous oral recall performance when the auditory feedback normally available had been 
impaired. It is also possible, as indeed Craik (1970) had suggested, that items retrieved late 
during any test trial are retrieved with greater difficulty, and that processes related to this more 
difficult retrieval might enhance the relative accessibility of those items subsequently. Morris 
considers this hypothesis, but testing it involves exactly the same kind of problem as we 
discussed in connection with the trace strength prediction. To use the same illustration from the 
point of view of retrieval difficulty, Vincentization can involve treating the fourth item in a 
four-item protocol as being more or less as easy or as difficult to retrieve as the seventh and 
eighth item in an eight-item one. Again, we question the validity of this implicit assumption. It is 
worth remarking that there is evidence in support of the retrieval difficulty hypothesis, though in a 
rather different paradigm (Gardiner, Craik & Bleasdale, 1973). In considering the relationship 
between initial output order and subsequent performance in free recall, however, we suggest that 
no wholly satisfactory method of testing that hypothesis, or any other, has yet been found. 


On testing the strength of trace hypothesis for output order in free recall: 
A reply to Gardiner and Herriot 


Peter E. Morris 


Gardiner & Herriot's (1977) paper provides further evidence to support my contention that once 
superficial encoding has been eliminated there is no relationship between initial recall order and 
the probability of items being retrieved at a later test. However, these data, like my own, 
depend upon a Vincentizing procedure, and the main purpose of Gardiner & Herriot's paper is to 
indicate what they consider to be the dubious assumptions of such methods. 

They are right to highlight the fact that assumptions are implicit in the use of Vincentization 
and that these can sometimes lead to a misinterpretation of results. However, I think that they 
are wrong when they conclude that these assumptions make the use of Vincentization in testing 
the strength hypothesis for output order in free recall invalid. They give an example where 
output protocols are divided into quartiles and say that, where one subject has recalled four items 
and another eight, this procedure ‘can lead to equating the fourth item in a four-item protocol 
with the seventh and eighth in an eight-item one’. This is certainly the hazard of Vincentization. 
My disagreement is with their following sentence when they state that ‘From the point of view 
of the notion that recall order proceeds from stronger to weaker item traces, the implication is 
that such items are more or less equivalent in their strength trace’. Such an assumption is not 
necessary. 

It is not necessary to assume that the items in the fourth quartile of all subjects are 
comparable. The necessary assumption is that, for each subject, the probability of later recall 
declines as one proceeds from the first through to the fourth quartile. Comparisons take place 
within subjects, and, in my report, were tested within subjects. The assumption of an ordinal 
decline in the probability of later recall across the Vincentized sections is all that is necessary to 
test the hypothesis that recall proceeds from stronger to weaker item traces. 

Gardiner & Herriot's second criticism is that explanations of output order must take into 
account not only encoding depth but also alterations in the memory trace that result from the 
initial act of recall. A first, though rather peripheral, reply is that the act of recall is involved in 
depth of encoding explanations, in that the recall of items only superficially encoded leads to 
those items receiving no deeper coding, and being unavailable for subsequent recall. However, 
Gardiner & Herriot's main point is that the act of retrieval of deeper processed items will 
increase the likelihood of their subsequent recall. They suggest that the difficulty in recalling 
items that are remembered at the end of the initial recall test may lead to these items having a 
greater enhancement of their accessibility than items retrieved earlier. 

There is no doubt that the act of recall improves the likelihood of the item being recalled later. 
However, the important question is whether the difficulty encountered in recalling the last few 
words at the initial test especially aids subsequent recall. Anyone wanting to save the strength of 
trace hypothesis by using this as an explanation of the lack of relationship between initial recall 
order and the probability of later recall would have to assume that the improvement in strength 
of trace through the difficulty of retrieval exactly balanced the decline in strength of trace 
formed at the initial presentation. Such a balancing would appear to have occurred at least four 
times (three times in the experiments that I report, and once in the data given by Gardiner & 
Herriot) under widely differing conditions. The proponent of this hypothesis is forced to 
maintain that the effect of difficulty of retrieval almost always balances the initial strength of 
trace. An hypothesis which proposes the balancing of two processes so that no difference is 
found is difficult to test, and, therefore, difficult to accept. The alternative, that neither process 
occurs, is more parsimonious. 

If no combination of processes is postulated, then the conclusion to be drawn from the four 
demonstrations of no relationship between output order and probability of later recall is that the 
difficulty of retrieving the items at the initial test does not differentially enhance the likelihood of 
their later recall. If difficulty of initial retrieval had an effect, then the last few items from the 
initial recall protocol would be the best recalled at the second test. One must either postulate 
some process that exactly compensates for the improvement through recall difficulty, or accept 
that little such improvement takes place. 

For reasons similar to those given earlier, I do not accept Gardiner & Herriot’s rejection of 
Vincentized data as being relevant to the testing of the improvement through recall hypothesis. 
Improvement in recall at the second test should show itself in the Vincentized protocols of each 
subject. Subjects should tend to have difficulty in recalling their last few items at the initial test, 
however many items they recall, and should therefore tend to recall those final items more 
frequently on the second test. 

While I reject the specific criticisms of Gardiner & Herriot I agree with the underlying 
principles which I assume to have motivated their comments. Vincentization can be a misleading 
technique. More importantly, discussion of simple hypotheses such as the strength trace 
hypothesis can suggest an oversimple model of the memory system. One object of my earlier 
paper was to illustrate the complexity of the memory processes. I hope that it will help to bring 
into question such rather naive concepts as strength of trace so that they may be replaced by a 
richer and more detailed analysis of the processes that underlie memory. 


Final remarks 


Peter Herriot and John M. Gardiner 


Morris's reply appears to fail to answer our earlier critique. Firstly, the objection to the 
Vincentization procedure still applies when directed to its application within subjects as well as 
between subjects. This is because subjects’ performance was averaged over lists and, clearly, 
subjects will have recalled different lists with differential success. Secondly, nowhere did we 
suggest that the flat recall function was due to a balancing of a trace strength effect and 
a retrieval difficulty effect. We merely suggested that the objections to the Vincentization 
procedure were valid whichever of these two theoretical possibilities was being entertained. 


The effects of within-sequence acoustic similarity on the short-term 
retention of consonants and words 


D. Marcer, W. A. Matthews and G. Dring 


One of the earliest demonstrations that forgetting of sub-span units occurs over brief, filled 
retention intervals was reported by Peterson & Peterson (1959). Subsequent work has shown that 
short-term loss is dependent upon both the number of previously learned trigrams (Keppel & 
Underwood, 1962), and the number of distractor items (Waugh & Norman, 1965). Such findings 
led Melton (1963) to propose a unitary theory of memory, with proactive (PI) and retroactive 
interference (RI) being major sources of forgetting. This theory predicts little short-term 
forgetting if both the PI and RI are absent. PI can be eliminated by testing each subject only 
once, or by using a longer inter-stimulus interval (Loess & Waugh, 1967), while RI can be 
minimized by using distractors which are conceptually dissimilar to the stimulus (Corman & 
Wickens, 1968). Experiments employing this methodology have not supported the unitary theory. 
Baddeley & Scott (1971) reported a significant decline in the recall of digit sequences over a 6 
sec interval, as did Marcer (1972) using a five-consonant sequence. To maintain a unitary model 
it is necessary to postulate a third source of interference, intra-unit interference (II). II refers to 
interference between the items within a stimulus, and was proposed as a factor in STM by 
Melton (1963). Melton showed that the rate of short-term forgetting increases with the number of 
items in the stimulus. Unfortunately as all stimuli, regardless of length, received identical 
presentation intervals, the different rates of forgetting may have reflected differences in original 
levels of learning rather than the effects of II. With immediate recall equated, Houston (1965) 
compared the short-term forgetting of five- and six-word sequences. More forgetting of the 
six-word sequence occurred over the first 1.5 sec, with parallel rates thereafter. Similar results 
were obtained by Baddeley & Scott (1971). The equivocal results of these two experiments 
suggest that within-sequence similarity, rather than stimulus length, might be a more appropriate 
independent variable. Interference theory would predict an increase in the rate of forgetting as 
within-sequence similarity increases. This hypothesis was tested here by comparing the rates of 
forgetting of five-item sequences of acoustically similar and dissimilar consonants and words in 
the absence of RI and PI. 


Experiment I 
The first experiment measured the minimum presentation interval necessary to equate immediate 
recall of the similar and dissimilar sequences. 


Method 


Consonant material. Two sets of five-consonant sequences were derived from the arrays BCDGPTV and 
BFHKQR. Each of 20 subjects undertook five recall trials, ten subjects receiving three trials involving 
similar sequences and two involving dissimilar sequences, and the other ten the reverse. Five presentation intervals were 
employed, 800-1600 msec, in 200 msec steps, presentation interval order being random across subjects and 
sequences. Presentation was tachistoscopic, with spoken, ordered recall, a 5 min inter-trial interval and a 100 
per cent recall criterion. 


Word material. The method was similar to that just described except that each subject received ten trials and 
the immediate recall criterion was 80 per cent rather than 100 per cent. The word sequences were those 
employed by Marcer (1975). An example of a similar sequence (AS) is: tap, pan, fat, fan, pat; and a dissimilar 
sequence (AD): met, nip, cog, gap, bus. 


Results 


The minimum presentation intervals were: consonants, AS = 1.40 sec, AD = 1.20 sec; words, 
AS = 2.80 sec, AD = 1.30 sec. 


Experiment II 
The second experiment measured the rates of forgetting of AS and AD materials. 


Method 


Consonant material. Eighty naive undergraduates were randomly assigned to eight equal groups, and given 
training at subtraction in steps of three from a number spoken by the subject, which served as the 
interpolated activity. Each then performed a single recall trial of the tachistoscopically presented sequence 
(either PDTCB or RQFBH). Spoken ordered recall was required after 0, 2, 4 or 6 sec. 


Word material. This differed from that just described as follows. Thirteen subjects were tested on all ten 
stimuli in a counterbalanced design, PI being minimized by using a 2 min inter-trial interval. Five retention 
intervals 0, 2, 4, 8 and 16 sec were employed. 


Results 


The mean percentages of items recalled in correct order are shown in Table 1. For consonants 
and words there was a retention interval effect (F = 29.8, d.f. = 3, 72, P < 0.001; F = 12.94, d.f. = 4, 
48, P < 0.001, respectively). In neither case was confusibility, or its interaction with retention 
interval, significant (F < 1 in both cases). The fact that immediate recall of words was less than 80 
per cent is probably due to different subjects being used in each experiment. 


Table 1. Mean percentage recall of AS and AD items

                         Retention interval (sec)
Type of material      0      2      4      6      8     16
AS consonants     100.0   58.0   58.0   36.0      —      —
AD consonants     100.0   60.0   64.0   44.0      —      —
AS words           64.5   51.0   40.0      —   21.5   18.5
AD words           64.5   55.5   46.0      —   23.0   26.0


Discussion 


These results do not support a unitary theory of memory. Both sets of AD sequences showed 
substantial forgetting despite the minimal levels of PI, RI and II. The slight (and statistically 
non-significant) extra forgetting of the AS sequences, together with the absence of a similarity by 
retention interval interaction, indicates that II is at most a relatively unimportant factor in 
short-term forgetting, and a trace decay interpretation seems the most plausible. The similar 
rates of forgetting of the AS and AD sequences imply that the acoustic confusion phenomenon 
is not due to acoustically similar traces decaying more rapidly. This supports an earlier finding 
by Baddeley (1968) who equated immediate recall by varying sequence length rather than 
presentation interval. It is also of interest that in order to equate immediate recall of the AS 
word sequences with that of the AD sequences, an increase in presentation time of 115 per cent 
was required. This compares with a value of only 12 per cent when subjects were asked to read 
these sequences (Marcer, 1972); and suggests that the acoustic confusion effect is not due solely 
to differences in the speed at which the different materials are scanned at input. Taken together, 
these findings support Baddeley's view that a substantial part of the cause of the acoustic 
confusion phenomenon is located at the retrieval stage. 
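
The 115 per cent figure follows directly from the minimum presentation intervals found in Experiment I, as the short check below shows. The consonant value is computed here from the same results for comparison only; it is not a figure stated in the report, and the function name is illustrative.

```python
def percent_increase(baseline, value):
    """Increase of value over baseline, as a percentage of baseline."""
    return 100.0 * (value - baseline) / baseline

# Minimum presentation intervals (sec) from Experiment I: AD is the baseline, AS the comparison.
words = percent_increase(1.30, 2.80)
consonants = percent_increase(1.20, 1.40)   # derived here, not quoted in the text

print(f"words: {words:.0f} per cent")
print(f"consonants: {consonants:.0f} per cent")
```

For words the increase is (2.80 - 1.30) / 1.30, or about 115 per cent, as stated.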


A note on voluntary control of heart-rate and short-term memory 


Alena Cerna and J. M. Wilding 


This experiment investigated whether voluntarily induced variation in an autonomic function 
which is assumed to reflect arousal was accompanied by changes in a cognitive function like 
those normally associated with changes in arousal. Short-term memory performance was tested 
following a period of voluntary acceleration or deceleration of heart-rate. 

STM has been shown to be affected by changes in arousal induced by emotionality of words or 
white noise, and to be related to variations in arousal occurring during the day; electrodermal 
activity and body temperature have been used as measures of arousal in some cases. While a 
number of qualifications must be made concerning these results, there is considerable support for 
a negative (or possibly inverted-U shaped) relation between arousal and STM and a positive 
relation between arousal and LTM (Uehling, 1972; Craik & Blankstein, 1975). 

Heart-rate does not reflect a single arousal system; it increases during mental arithmetic and 
decreases during attention to external stimuli (Lacey, Kagan, Lacey & Moss, 1963; Lacey & 
Lacey, 1970). The present experiment tests whether altering an internal state which affects 
heart-rate also affects mechanisms involved in STM; since different methods of altering heart- 
rate may have different effects, the results may be specific to the method used, which was to 
suggest that subjects thought about arousing or calming situations. 

On the basis of previous findings on the relation between arousal and STM and of the finding 
that reduced heart-rate is associated with attention to external stimuli, it was predicted that STM 
performance would be superior when heart-rate was lower. 

Ten male and ten female subjects, aged between 20 and 32, learned to vary heart-rate with 
the aid of visual feedback (with the option also of auditory ‘bleeps’ in time with heart beats). A 
digital counter fed from a biological amplifier, recording from electrodes on the wrists and back 
of the left hand, displayed the number of beats recorded during 20 sec for the next 20 sec period. 
These numbers were recorded. 

Subjects relaxed for 10 min, having been asked to sit quietly, breathe normally and try not to 
move, then a base-line heart-rate was recorded for 5 min; after 2 min rest subjects were asked 
(increase condition), ‘Could you please visualize or think about a stressful situation, for 
example a traffic jam, coming examinations, earthquake or anything which may increase your 
heart-rate’, or (decrease condition), ‘Could you please visualize or think of a peaceful scene, 
music which may relax you, or if you wish to meditate you can do so’. These conditions were 
given in random order, each for 10 min. Mean rates for the base-line, increase and decrease 
conditions were 70.55, 76.88, 67.88 beats per minute (beats/min) respectively. By analysis of 
variance, testing the effect of instruction against its interaction with subjects, the increase and 
decrease conditions differed significantly (F = 23.21, d.f. = 1, 72, P < 0.001). There were no 
significant sex differences; the effect of time intervals was just significant (P < 0.05). 

In the test session, which took place at least 24 hours later, conditions were the same except 
that no auditory feedback was provided. A base-line was recorded for 10 min, then five male and 
five female subjects were assigned at random to an increase and the rest to a decrease condition 
for 5 min. Then a tape-recording instructed them to listen to 20 words of low emotionality (rated 
in a previous experiment), one every 2 sec, and recall them in any order. Mean rates for the 
base-line and experimental conditions were 65.3 and 71.6 beats/min (increase) and 68.6 and 63.3 
beats/min (decrease). The effect of instruction was represented in the analysis of variance by the 
significant condition × instruction interaction (F = 13.77, d.f. = 2, 32, P < 0.001). Time intervals 
had no significant effects, but males showed a significantly greater effect of instruction; male 
heart-rates in the increase and decrease conditions were 74.0 and 60.2 beats/min and female 
rates were 69.2 and 65.8 beats/min. 

The increase group recalled a mean of 7.5 words and the decrease group a mean of 10.7 words 
(F = 16.68, d.f. = 1, 17, P < 0.001). There were no sex differences. The prediction was therefore 
confirmed. 

It had been intended to record heart-rate during the memory test, but interference from the 
tape-recorder and subjects’ movements spoiled the record, so it was impossible to tell whether 
differences persisted during the test which may have been related to performance. As already 
pointed out, the result may be specific to the method of inducing arousal; equally it may be 
specific to the learning material. For example the result may reflect differences in set produced 
by the different instructions for the increase and decrease conditions and subjects in the increase 
condition might show superior performance if the words to be learned were related to the 
stressful situations they were asked to think about. Further work is therefore needed to clarify 
interpretation of the result and also to examine long-term retention. 

Familial sinistrality and degree of left-handedness 


Walter F. McKeever and Allan D. VanDeventer 


Hécaen & Sauguet (1970) reported that, according to a handedness inventory measure, familial sinistrality 
among left-handers (brain-damaged patients) was associated with weak left-handedness. We assessed the 
relationship of familial sinistrality and degree of left-handedness among 71 normal left- and 80 right-handed 
subjects. No general relationship of degree of left-handedness, defined by four handedness tasks, to familial 
sinistrality obtained. Only one of the tasks (finger tapping speed) significantly differed between familial and 
non-familial left-handers, the familial left-handers being more strongly left-handed on the task. 


Genetic theories of left-handedness typically assume that familial sinistrality is positively related 
to degree of left-handedness (Trankell, 1955; Annett, 1964, 1975). Hécaen & Sauguet (1970), 
however, in a study of 49 unilaterally brain-damaged patients, found that familial left-handers 
were the weakly left-handed according to a handedness inventory. That finding is directly 
contradictory to the one gene-two allele, Mendelian mode hypotheses of Trankell and of Annett 
(1964 model) as well as to Annett’s most recent model (see Annett, 1975, p. 325). These theories 
also suggest that familial sinistrality (FS) in right-handers would be associated with decreased 
right-hand skill and preference (greater manual ambilaterality) than in right-handers lacking 
left-handed relatives, since they assume partial penetrance of the recessive allele in 
heterozygotes (Trankell, 1955; Annett, 1964) or a greater likelihood of the absence of the 'right 
shift' factor (Annett, 1975). 

No systematic assessment of the relationship of FS to degree of hand preference and manual 
skill superiority has been undertaken with normal subjects. The closest approach to a 
demonstration of a FS-degree of handedness relationship has come from Annett's efforts to show 
that differing percentages of ‘right-handers’, ‘mixed-handers’, and ‘left-handers’ (as defined by 
her inventory) issue from right-right, mixed-mixed, and left-left parental pairs. Those data, 
particularly in her 1972 (Annett, 1972) paper, tend to show some inheritance of degree of left- 
handedness. At the same time, those data, which show a rough correlation of degree of left- 
handedness in parents and children where both had filled out the Annett Handedness Inventory, 
suffer from the fact that the assessment of left-handedness in relatives was restricted to the 
parent-child instances and that bias, in the direction of enhanced parent-child handedness 
inventory agreement, could have resulted from the parents and children in some instances filling 
out the inventory together in their homes. 

Knowledge regarding the relationship between FS and degree of left-handedness could also be 
helpful in resolving some of the inconsistency in the literature concerning the dichotic listening 
performances of left-handers. For example, Zurif & Bryden (1969) found only those left-handers 
with a positive familial sinistrality (FS+) to show equal recall from both ears while those with a 
negative history of familial left-handedness (FS-) showed right ear superiority comparable to 
right-handers. Dee (1971) found weakly left-handed sinistrals to lack right ear superiority, while 
Knox & Boone (1970) and Satz, Achenbach & Fennell (1967) found strongly left-handed subjects 
to be the ones lacking right ear superiority. If FS and degree of left-handedness are positively related, 
Zurif & Bryden's findings would support the Knox & Boone (1970) and the Satz et al. (1967) results; if 
FS is negatively related to degree of left-handedness, then the Zurif & Bryden finding would be 
consistent with that of Dee (1971). 

We assessed this question in 71 self-avowed left-handed and 80 self-avowed right-handed 
college students classified for FS. The FS+ designations were assigned to those having at least 
one first-degree or at least two second-degree left-handed relatives. This definition is virtually 
identical to that of Hécaen & Sauguet except that they included left-handed cousins among the 
relatives contributing to FS+ designations and we did not. Complete FS data were secured, with 
the aid of the parents of the subjects, for all grandparents and all biologically related aunts and 
uncles of the subjects on whom we report. There were a total of 37 FS+ left-handers (14 male, 
23 female), 34 FS- left-handers (15 male, 19 female), 36 FS+ right-handers (14 male, 22 
female), and 44 FS- right-handers (17 male, 27 female). 
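The FS+ rule stated above can be written as a short sketch. The function name and its two count arguments are illustrative assumptions, not part of the study materials; cousins are simply never counted, as in the definition given here.

```python
def classify_fs(first_degree_left: int, second_degree_left: int) -> str:
    """Return 'FS+' or 'FS-' from counts of left-handed relatives.

    Hypothetical helper: FS+ requires at least one left-handed first-degree
    relative or at least two left-handed second-degree relatives
    (cousins are not counted).
    """
    if first_degree_left < 0 or second_degree_left < 0:
        raise ValueError("counts must be non-negative")
    if first_degree_left >= 1 or second_degree_left >= 2:
        return "FS+"
    return "FS-"


if __name__ == "__main__":
    print(classify_fs(1, 0))  # one left-handed first-degree relative
    print(classify_fs(0, 1))  # a single left-handed second-degree relative
    print(classify_fs(0, 2))  # two left-handed second-degree relatives
```

A subject with a single left-handed grandparent, aunt or uncle and no left-handed first-degree relative therefore falls in the FS- group under this rule.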

Strength of handedness was assessed on the following tasks: The Tapping (Finger Oscillation) 
Test (Reitan, 1959), The Grooved Pegboard Test (Klove, 1963), dynamometer grip strength, and 
The Edinburgh Handedness Inventory (Oldfield, 1971). The measures derived from these tasks, 
in order, were: number of taps achieved by each hand, summed over two 10 sec trial periods; 
number of seconds required for each hand to insert all pegs into the pegboard; mean grip strength 
(kg) per trial, averaged from three trials for each hand; and number of ‘checks’ (see Oldfield, 
1971) for each hand for the ten unimanual acts described in the short form of the inventory. 
Testing always began with the preferred hand. 


Table 1. Left-hand (LH) and right-hand (RH) scores of unimanual tasks for different handedness 
and FS groups 

                     Tapping         Grip            Pegboard        Inventory
                     (taps)          (kg)            (sec)           (checks)
                     LH      RH      LH      RH      LH      RH      LH      RH
Left-handed FS+      98.7    92.9    37.0    37.3    61.6    62.8    12.1    3.6
Left-handed FS-      92.9    91.0    38.1    37.6    62.4    63.8    12.1    3.7
All left-handed      95.7    92.0    37.5    37.4    62.0    63.2    12.1    3.7
Right-handed FS+     89.0    96.9    33.2    35.5    67.8    61.6    1.9     13.4
Right-handed FS-     85.4    95.1    35.7    37.8    65.8    62.3    2.2     13.8
All right-handed     87.0    95.9    34.5    36.7    66.6    62.0    2.1     13.6


Left- and right-hand scores for each task are shown in Table 1 for the various handedness and 
FS groups. Results for each task were analysed separately for each handedness group. 
Three-factor analyses of variance (FS, Sex, Hand), with Hand a repeated measure, were computed. 
Among the left-handed subjects, the only significant FS × Hand interaction occurred for the 
Tapping Test, the FS+ subjects showing significantly greater (P < 0.05) left-hand tapping speed 
superiority than FS- subjects. Only the Tapping Test and Edinburgh Inventory yielded 
significant left-hand superiorities, a result consistent with previous reports of considerable 
ambilaterality among avowedly left-handed persons (Benton, Meyers & Polder, 1962; Satz et al. 
1967). The only significant main effects of sex occurred on the Tapping Test (males faster) and 
grip strength test (males stronger). No interaction of sex with either FS or hand obtained. 
Finally, a multivariate handedness score (sum of the standardized difference scores between 
hands across all tasks for each subject) also failed to show any difference in ‘overall’ 
left-handedness between FS groups. 
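One way such a composite could be computed is sketched below. The exact standardization used in the study is not described, so the following is an assumption: each task's between-hand difference is signed so that positive values favour the left hand, converted to a z-score across subjects, and summed over tasks. The scores are invented purely for illustration.

```python
from statistics import mean, pstdev


def standardize(values):
    """z-score a list of numbers using the population standard deviation."""
    m, s = mean(values), pstdev(values)
    if s == 0:
        raise ValueError("no variation in scores")
    return [(v - m) / s for v in values]


def composite_left_handedness(diffs_by_task):
    """Sum standardized left-hand advantage scores across tasks.

    diffs_by_task maps a task name to one difference score per subject,
    already signed so that larger values mean a stronger left-hand advantage
    (e.g. right-hand minus left-hand time on the pegboard, where less time is better).
    """
    z_by_task = [standardize(d) for d in diffs_by_task.values()]
    return [sum(zs) for zs in zip(*z_by_task)]


# Invented difference scores for four hypothetical subjects.
example = {
    "tapping": [6.0, 2.0, -1.0, 5.0],     # LH taps minus RH taps
    "pegboard": [1.5, -0.5, -2.0, 1.0],   # RH seconds minus LH seconds
    "grip": [0.5, 0.0, -1.0, 0.5],        # LH kg minus RH kg
    "inventory": [9.0, 7.0, 4.0, 10.0],   # LH checks minus RH checks
}
scores = composite_left_handedness(example)
```

Because every task contributes a z-score, tasks measured in different units (taps, seconds, kg, checks) carry equal weight in the sum.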

For right-handers, significant right-hand superiority obtained on all tasks. There were no main 
effects of FS nor interactions of FS with other factors. As with the left-handers, significant main 
effects of sex were found on Tapping (males faster) and grip strength (males stronger). 

The results for left-handers do not support Hécaen & Sauguet’s (1970) finding of a negative 
relationship between FS and degree of left-handedness. This removes one problem for the 
genetic theories discussed earlier. We would suggest that the relationship reported by Hécaen & 
Sauguet does not obtain among normal subjects and, further, that inquiries regarding both FS 
and preferred hand for different activities may yield unreliable results from patients with various 
cognitive deficits (such as aphasia and dyslexia) and paralyses. 

At the same time, present results clearly argue against any general positive relationship 
between FS and degree of handedness. Only the Tapping Test showed a significant 
discrimination between FS+ and FS- left-handers. While it is probable that of the tasks 
employed only the Tapping Test absolutely requires contralateral hemispheric control 
(Gazzaniga, Bogen & Sperry, 1967), it is rather surprising that the other tasks revealed no FS 
effects. Annett’s most recent model (Annett, 1972, 1975), which allows a role to cultural 
influences, appears less violated by present results than the Trankell and the earlier Annett 
model. It seems likely that cultural (‘the right-handed world’) influences would be less capable of 
modifying a simple digital speed response than of modifying the grosser, and not necessarily 
contralaterally controlled, manual activities represented in the other tasks. 

Suppression of response rates in variable-interval schedules by a 
concurrent schedule of reinforcement 


C. M. Bradshaw 





Herrnstein’s equation, which describes the relationship between reinforcement frequency and response rate 
in variable-interval schedules, predicts that the introduction of a concurrent source of reinforcement will 
increase the reinforcement frequency needed to obtain the half-maximal response rate, without affecting the 
theoretical maximal response rate itself. This prediction was tested in the present experiment. During Phase I 
four rats were trained to press a lever (lever A) in variable-interval schedules, using a range of reinforcement 
frequencies. During Phase II, a second lever (lever B) was introduced; responses on lever B were always 
reinforced at the same frequency, while the reinforcement frequency for lever A responses was varied. 
During Phase I the data obtained from all four rats conformed closely to Herrnstein's equation. Comparing 
the absolute response rates in Phase I and Phase II it was found that the predictions based on Herrnstein’s 
equation were qualitatively confirmed, although there was some quantitative discrepancy between the 
predicted and actual degree of response suppression during Phase II. 





Herrnstein has proposed the following equation to describe the relationship between the 
frequency of reinforcement and the rate of responding in variable-interval (VI) schedules of 
reinforcement: 

$$R_A = \frac{R_{\max}\, r_A}{K_H + r_A} \qquad (1)$$

(Herrnstein, 1970, eq. 13), where $R_A$ is the response rate and $r_A$ is the reinforcement frequency. 
This equation defines a rectangular hyperbola. 

The constant $R_{\max}$ (k in Herrnstein's original formulation) is the maximum rate of responding 
which can be generated in a VI schedule by increasing the reinforcement frequency, since when 
$r_A \gg K_H$, $R_A \to R_{\max}$ (Herrnstein, 1974). The constant $K_H$ is the reinforcement frequency which 
corresponds to the half-maximal response rate, since when $r_A = K_H$, $R_A = R_{\max}/2$.* Using the 
data of Catania & Reynolds (1968), Herrnstein has shown that eq. (1) accurately describes the 
behaviour of pigeons in VI schedules (Herrnstein, 1970, fig. 8). 
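As a quick numerical check on the two properties just described, eq. (1) can be evaluated directly. The sketch below is illustrative only: the function name is hypothetical and the values chosen for $R_{\max}$ and $K_H$ are assumptions, not estimates from any data set.

```python
def response_rate(r_a: float, r_max: float, k_h: float) -> float:
    """Eq. (1): R_A = R_max * r_A / (K_H + r_A)."""
    if r_a < 0 or k_h <= 0:
        raise ValueError("r_a must be >= 0 and k_h must be > 0")
    return r_max * r_a / (k_h + r_a)


# Assumed illustrative constants.
R_MAX = 50.0   # responses per minute
K_H = 30.0     # reinforcements per hour

half = response_rate(K_H, R_MAX, K_H)        # r_A = K_H gives R_max / 2
near_max = response_rate(1e6, R_MAX, K_H)    # r_A >> K_H approaches R_max
```

Setting the reinforcement frequency equal to $K_H$ returns exactly half of the assumed maximum, and a very large reinforcement frequency returns a rate just below the maximum.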

Herrnstein offers an extended version of eq. (1) to describe the relationship between 
response rate $R_A$ and reinforcement frequency $r_A$ in the presence of an alternative source of 
reinforcement, delivered at a frequency $r_B$: 

$$R_A = \frac{R_{\max}\, r_A}{K_H + r_A + r_B} \qquad (2)$$

[Herrnstein, 1970, eq. (15)]. The subscripts A and B designate the two sources of reinforcement. 
The rate of ‘alternative’ responding $R_B$ maintained by $r_B$ is specified by a symmetrical equation: 

$$R_B = \frac{R_{\max}\, r_B}{K_H + r_A + r_B} \qquad (3)$$

* Some controversy surrounds the theoretical significance of $K_H$. Herrnstein (1970, 1974), who uses the 
expression $r_o$, assumes that it reflects the frequency of ‘extraneous’ reinforcement. On the other hand, 
Catania (1973), who uses the expression C, has suggested that it reflects the hypothetical inhibitory effects of 
reinforcement. However, since the empirical validity of eq. (1) is independent of these rival interpretations, 
the neutral expression $K_H$ (Herrnstein's constant) has been adopted in this paper. 

Equation (2) implies that when the value of $r_B$ is held constant, $R_A$ is an increasing function of 
$r_A$, while eq. (3) implies that for a constant value of $r_B$, $R_B$ decreases with increasing values of 
$r_A$. Both these predictions receive empirical support from the data of Catania (1963) (see 
Herrnstein, 1970, fig. 11). 

An interesting prediction which has not yet been tested experimentally can be derived from the 
relationship between eq. (1) and eq. (2). It can be predicted that the introduction of a second 
source of reinforcement, delivered at a constant frequency $r_B$, should suppress the rate of 
responding $R_A$ maintained by the first source of reinforcement. This would be reflected in an 
increase in the value of $r_A$ needed to produce a half-maximal response rate, since from eq. (1) 
$R_A = R_{\max}/2$ when $r_A = K_H$, whereas from eq. (2) $R_A = R_{\max}/2$ when $r_A = K_H + r_B$. However, the 
suppressive effects of $r_B$ on $R_A$ should be minimal at very high values of $r_A$, since eq. (1) and eq. 
(2) both predict that $R_A$ approaches the same asymptote $R_{\max}$. 
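The two predictions in this paragraph can be illustrated numerically. In the sketch below the function names and the values of $R_{\max}$ and $K_H$ are assumptions chosen for illustration; only the lever B frequency of 20.7 rft/hr is taken from the procedure of the experiment.

```python
def rate_single(r_a, r_max, k_h):
    """Eq. (1): one source of reinforcement."""
    return r_max * r_a / (k_h + r_a)


def rate_concurrent(r_a, r_b, r_max, k_h):
    """Eq. (2): a second source delivering r_b reinforcements per hour."""
    return r_max * r_a / (k_h + r_a + r_b)


R_MAX, K_H = 50.0, 30.0   # assumed illustrative constants
R_B = 20.7                # rft/hr, the lever B frequency used in Phase II

# Half-maximal responding now requires r_A = K_H + r_B rather than K_H.
half_point = rate_concurrent(K_H + R_B, R_B, R_MAX, K_H)

# Proportion of the single-schedule rate that survives at low and high r_A.
ratio_low = rate_concurrent(10.0, R_B, R_MAX, K_H) / rate_single(10.0, R_MAX, K_H)
ratio_high = rate_concurrent(1000.0, R_B, R_MAX, K_H) / rate_single(1000.0, R_MAX, K_H)
```

The surviving proportion is $(K_H + r_A)/(K_H + r_A + r_B)$, which rises towards 1 as $r_A$ grows, so suppression is greatest on lean schedules and nearly vanishes on rich ones.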

The present experiment was an attempt to test these implications of eqs. (1) and (2). 


Method 
Subjects 


Four female albino Wistar rats (Rats 3, 4, 5 and 6), aged about four months and weighing about 200 g at the 
start of training, were housed individually and were allowed free access to food for one hour at the end of 
each experimental session. Water was available ad lib in the home cages. 


Apparatus 


The rats were trained in a standard operant conditioning chamber measuring 23 cm × 28 cm × 19 cm (Grason 
Stadler rat station, model no. E3125C-100). One wall contained a recess into which a dipper mechanism 
could deliver 0.05 ml of liquid reinforcement (Fussell's sweetened condensed milk, diluted one part to three 
parts of water). A lever (lever A) was situated above the recess (7.5 cm above the floor of the chamber). 
This lever could be operated by a force of approximately 22 g (0.22 N). During Phase II of the experiment a 
second lever (lever B) was introduced into the chamber. Lever B was situated to the right of the 
reinforcement recess, and could be operated by a force of approximately 22 g (0.22 N). Illumination was 
provided by a red house light. The whole chamber was enclosed in a sound attenuating chest; masking noise 
was provided by a rotary fan. 

Conventional electromechanical programming equipment was situated in an adjoining room. Responses and 
reinforcements were recorded on counters and cumulative recorders. 


Procedure 


After acclimatization to the food deprivation regime, the rats were lever trained by the method of successive 
approximations. After three sessions of continuous reinforcement, they were subjected to a series of 
variable-interval schedules as described below. Training sessions took place daily, with few exceptions, at 
the same time each day. Each session was terminated after 50 reinforcements or 90 min, whichever occurred 
sooner. 

Variable-interval schedules were used throughout the experiment. The distribution of the intervals was as 
described by Catania & Reynolds (1968, Appendix II). Throughout this paper the schedules are referred to in 
terms of the average reinforcement frequency (rft/hr) specified by the schedule. (In no case did the delivered 
reinforcement frequency deviate by more than 5 per cent from the scheduled reinforcement frequency.) 

The experiment consisted of three phases, Phase I, Phase II, Phase III; the schedules of reinforcement 
used in the three phases of the experiment are shown in Table 1. 


Phase I. During Phase I, only one lever (lever A) was present in the chamber. The animals were exposed to 
a series of schedules, each specifying a different reinforcement frequency. Training in the presence of each 
schedule was continued until the response rates of all the rats appeared to be stable. (With one exception, 
exposure to each schedule was for 30 days or more.) 


Phase II. A second lever (lever B) was introduced, responses on which were always reinforced at 20.7 rft/hr 
using a variable-interval schedule. Responses on lever A were reinforced according to a series of different 
schedules. Change-over delays were not employed. Exposure to each schedule was continued until the 
response rates of all the rats appeared to be stable (in every case, exposure was for 30 days or more). 


Phase III. Finally, lever B was withdrawn from the chamber and the animals were again exposed (30 
sessions or more) to two different schedules on lever A. 

Table 1. Schedules of reinforcement used in the experiment 

              Reinforcement frequency
              (rft./hr) (VI schedules)
Phase of
experiment    Lever A      Lever B      Sessions
I             253.5        -            30
              94.7         -            37*
              20.7         -            21
              10.5         -            30
              7.0          -            30†
II            253.5        20.7         35
              94.7         20.7         44
              50.0         20.7         32
              33.6         20.7         33
III           253.5        -            30
              10.5         -            31

* Due to a technical failure, data for this schedule are missing in the case of rat 6. 
† This schedule did not maintain responding in the case of rat 3. 


Results 
Rate of responding on lever A (Phase I vs. Phase II) 


The mean response rates on lever A, averaged over the last five sessions’ exposure to each 
schedule, were calculated individually for each rat. The results obtained from all four subjects 
are shown in Fig. 1. 

The graphs in the left-hand column of Fig. 1 show response rates ($R_A$) plotted against 
reinforcement frequency ($r_A$). A computer program was used to fit rectangular hyperbolae to the 
data by non-linear regression analysis, and to estimate values of the maximum response rate and 
the reinforcement frequency corresponding to the half-maximal response rate (Wilkinson, 1961). 
The estimated values of the constants, together with their associated standard errors, are shown 
in Table 2. Comparison of the values obtained in Phase I with those obtained in Phase II shows 
that the presence of lever B in Phase II had no significant effect on the maximum response rate, 
but did result in an increase in the value of $r_A$ needed to obtain the half-maximal response rate. 
(This was true of all four subjects, although the change observed with rat 5 did not achieve 
statistical significance.) 

The data were also analysed by plotting the reciprocal of the response rate against the 
reciprocal of the reinforcement frequency. Both eq. (1) and eq. (2) can be transformed to linear 
functions relating $1/R_A$ and $1/r_A$: 

$$\frac{1}{R_A} = \frac{1}{r_A} \cdot \frac{K_H}{R_{\max}} + \frac{1}{R_{\max}} \qquad (1a)$$

$$\frac{1}{R_A} = \frac{1}{r_A} \cdot \frac{K_H + r_B}{R_{\max}} + \frac{1}{R_{\max}} \qquad (2a)$$

(Lineweaver & Burk, 1934; Cohen, 1973). Correspondence of the data, in double reciprocal 
form, to a straight-line function is a measure of the degree of conformity of data to eqs. (1) and 
(2). Double reciprocal plots of the results obtained from all four subjects are shown in the 
right-hand column of Fig. 1. In every case the linear regression coefficient was greater than 0.96. 
Inspection of the double reciprocal plots also shows that the lines derived from Phase I and 
Phase II intersect the ordinate at the same point (indicating that the values of the maximum 
response rates were unchanged during Phase II), while the slopes of the lines were steeper 
during Phase II (indicating an increase in the values of $r_A$ corresponding to $R_{\max}/2$). 

Figure 1. Relationship between response rate on lever A ($R_A$) and reinforcement frequency for responding on 
lever A ($r_A$) for all four subjects. Closed circles, data obtained during Phase I; open circles, data obtained 
during Phase II; closed triangles, data obtained during Phase III. 

Left-hand graphs: $R_A$ vs. $r_A$. Points are mean response rates (±s.e.m.) for last five sessions’ exposure to 
each schedule. Curves were derived by non-linear regression analysis. 

Right-hand graphs: $1/R_A$ vs. $1/r_A$. Lines were derived by linear regression. 


Table 2. Values of the constants obtained during the two phases of the experiment 

        Maximum response rate                  Reinforcement frequency corresponding
        (responses per minute, ±s.e. est.)     to half-maximal response rate
                                               (reinforcements per hour, ±s.e. est.)
Rat     Phase I          Phase II              Phase I           Phase II
3       50.7 (±5.2)      61.0 (±8.0)†          28.9 (±10.5)      207.0 (±49.7)**
4       49.8 (±4.2)      58.1 (±8.0)†          35.4 (±9.6)       123.0 (±36.9)*
5       49.1 (±4.9)      41.1 (±3.5)†          37.4 (±11.3)      49.4 (±12.1)†
6       36.6 (±2.0)      39.6 (±1.6)†          30.7 (±4.8)       93.8 (±8.3)***

Significance of difference (t test): * P < 0.025; ** P < 0.01; *** P < 0.0005; † n.s. (P > 0.1). 


Table 3. Rates of responding on lever B ($R_B$) during Phase II. Lever B responses were reinforced 
at a constant frequency of 20.7 rft./hr. The reinforcement frequency for lever A responding ($r_A$) 
was varied. Values are mean rates (±s.e.m.) over last five sessions’ exposure to each schedule 

        Value of r_A (rft./hr)
Rat     33.6            50.0            94.7            253.5
3       4.7 (±0.5)      5.2 (±0.1)      8.7 (±0.6)      3.1 (±0.3)
4       6.2 (±0.3)      4.1 (±0.2)      6.8 (±0.3)      4.8 (±1.1)
5       6.2 (±0.3)      8.1 (±0.2)      5.4 (±0.2)      3.0 (±0.3)
6       5.5 (±0.4)      6.1 (±0.8)      4.9 (±0.4)      5.6 (±0.2)
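The double-reciprocal analysis described in this section can be sketched as an ordinary least-squares fit of $1/R_A$ on $1/r_A$, reading $R_{\max}$ from the intercept and $K_H$ from the slope. The function name and the constants are assumptions, and the data are synthetic: they are generated without noise from eq. (1) at the Phase I reinforcement frequencies, so the fit simply recovers the assumed values. With real data, double-reciprocal fitting gives heavy weight to the lowest response rates, which is one reason the constants themselves were estimated by non-linear regression.

```python
def fit_double_reciprocal(r, R):
    """Least-squares line through (1/r, 1/R); returns (R_max, K_H).

    From eq. (1a): intercept = 1/R_max and slope = K_H/R_max.
    """
    x = [1.0 / v for v in r]
    y = [1.0 / v for v in R]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r_max = 1.0 / intercept
    k_h = slope * r_max
    return r_max, k_h


# Phase I reinforcement frequencies (rft/hr) with assumed constants.
TRUE_R_MAX, TRUE_K_H = 50.0, 30.0
freqs = [253.5, 94.7, 20.7, 10.5, 7.0]
rates = [TRUE_R_MAX * f / (TRUE_K_H + f) for f in freqs]

r_max_est, k_h_est = fit_double_reciprocal(freqs, rates)
```

Under eq. (2a) the same fit applied to Phase II data would leave the intercept unchanged and return a slope corresponding to $K_H + r_B$.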


Response rates on lever A and lever B (Phase II) 


Response rates on lever B during Phase II are shown in Table 3. In the case of all four animals 
the response rates were low, and there was no consistent tendency for the values of $R_B$ to 
decrease with increasing values of $r_A$. 

Figure 2 shows the relative rates of responding on lever A [$R_A/(R_A + R_B)$] plotted against the 
relative frequency of reinforcement for responding on lever A [$r_A/(r_A + r_B)$]. The best-fit regression 
line had a slope of 0.810 and intersected the ordinate at +0.146. The regression coefficient was 
0.976. 


Reversibility of suppression of response rates on lever A (Phase III) 


The response rates on lever A observed during Phase III are shown in the left-hand column of 
Fig. 1 (closed triangles). In the case of all four rats there was no statistically significant 
difference between the mean response rates obtained during Phase III and the corresponding 
mean rates obtained during Phase I (t test, P > 0.1). 


Discussion 


The data obtained from all four rats during Phase I conformed closely to eq. (1) (see Fig. 1, 
closed circles). Thus the present results obtained with rats are in agreement with earlier 
observations made on pigeons (Catania & Reynolds, 1968; see Herrnstein, 1970). During Phase II 
there was approximate matching between relative response rates and relative reinforcement 
frequency (see Fig. 2). Thus the present results join the body of evidence which supports the 
Matching Law (see Herrnstein, 1970). 
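The link between Herrnstein's concurrent-schedule equations and matching can be made explicit. Assuming, as the symmetrical form of eqs. (2) and (3) implies, that both responses share the same $R_{\max}$ and $K_H$, the common denominator cancels when the relative response rate is formed:

```latex
\frac{R_A}{R_A + R_B}
  = \frac{\dfrac{R_{\max}\, r_A}{K_H + r_A + r_B}}
         {\dfrac{R_{\max}\, r_A}{K_H + r_A + r_B}
          + \dfrac{R_{\max}\, r_B}{K_H + r_A + r_B}}
  = \frac{r_A}{r_A + r_B}
```

Exact matching therefore corresponds to a slope of 1 and an intercept of 0 when relative response rate is plotted against relative reinforcement frequency; a fitted line with a shallower slope and a positive intercept indicates only approximate matching.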

In analysing the present data, response rates were plotted against scheduled (rather than 
delivered) reinforcement frequency. Although it is usual in this kind of study to use delivered 
reinforcement frequency, it has been suggested that this practice may result in misleading 
conclusions. For example, Mackintosh (1974) has pointed out that in concurrent schedules 
apparent matching may be obtained even if the subject responds hardly at all in one component, 
since a very low response rate may result in a diminished reinforcement frequency in that 
component. However, these considerations are not crucial to the present experiment where the 
delivered reinforcement frequency in any schedule never deviated by more than 5 per cent from 
the scheduled reinforcement frequency. 

Figure 2. Relationship between relative rate of responding and relative frequency of reinforcement during 
Phase II. Data from the four subjects have been pooled. 

In both Phase I and Phase II the schedules were presented in the order of decreasing 
reinforcement frequency. Thus it is possible that an order effect might have interfered with the 
present data. That this was not the case, however, is shown by the results of Phase III, where 
the response rates on a high and a low density schedule did not differ significantly from the 
response rates observed on the same schedules during Phase I. 

In contrast to the observations of Catania (1963), there was no consistent inverse relationship 
between $R_B$ and $r_A$ during Phase II. Thus the present data do not conform to eq. (3). The reason 
for this discrepancy is not clear. However, it should be noted that throughout Phase II the value 
of $r_A$ was considerably greater than that of $r_B$. Under these conditions, eq. (3) predicts that 
relatively large increases in the value of $r_A$ will be reflected in only small reductions in the size 
of $R_B$. 

The main purpose of this experiment was to determine in what way the response rates on lever 
A were affected by the introduction of the alternative source of reinforcement $r_B$. On the basis 
of Herrnstein's formulation, variables which suppress responding in VI schedules can be 
assigned to one of three possible categories: (1) those which reduce the maximum response rate; 
(2) those which increase the reinforcement frequency needed to obtain the half-maximal 
response rate; (3) those which do both. The present results indicate that an alternative 
(concurrent) source of reinforcement belongs to the second category. This is in agreement 
with the predictions derived from eqs. (1) and (2). 

There is, however, some quantitative discrepancy between the present results and those 
predicted by eq. (2). This discrepancy is illustrated in Table 4. Eq. (2) predicts that during Phase 
II the value of $r_A$ corresponding to $R_{\max}/2$ should have increased from $K_H$ to ($K_H + r_B$). In the 
case of three of the four rats, however, considerably greater increases were seen than could be 
predicted by eq. (2). A possible explanation for this discrepancy could lie in Herrnstein's 
interpretation of $K_H$ (‘$r_o$’; see Herrnstein, 1970). It is possible that the introduction of the second 
lever during Phase II, in drawing the animal to the right-hand side of the chamber, produced an 
increase in the frequency of unscheduled reinforcement (and hence an increase in the value of 
$K_H$). However, it is also possible that the greater-than-predicted response suppression during 
Phase II is attributable to the alternative schedule of reinforcement itself. Further experiments 
are needed to resolve this issue. 

Table 4. Observed and predicted values of $r_A$ corresponding to $R_{\max}/2$ 

        Value of r_A corresponding to R_max/2
                       Phase II
        Phase I        Predicted from eq. (2)
Rat     (= K_H)        (= K_H + r_B)             Actual
3       28.9           49.6                      207.0
4       35.4           56.1                      123.0
5       37.4           58.1                      49.4
6       30.7           51.4                      93.8
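The predicted column of Table 4 follows from adding the constant lever B frequency (20.7 rft/hr) to each rat's Phase I estimate of $K_H$. The short sketch below reproduces that arithmetic from the tabled values; the variable names are illustrative.

```python
R_B = 20.7  # rft/hr delivered on lever B throughout Phase II

# Phase I estimates of K_H and Phase II observed values, as tabled.
phase1_kh = {3: 28.9, 4: 35.4, 5: 37.4, 6: 30.7}
phase2_observed = {3: 207.0, 4: 123.0, 5: 49.4, 6: 93.8}

# Eq. (2) prediction: the half-maximal frequency rises from K_H to K_H + r_B.
predicted = {rat: round(kh + R_B, 1) for rat, kh in phase1_kh.items()}

# Observed minus predicted; positive values mean more suppression than predicted.
excess = {rat: round(phase2_observed[rat] - predicted[rat], 1) for rat in predicted}
```

Only rat 5 shows an observed value below the prediction; for the other three rats the observed value exceeds it by a wide margin.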

Cohen (1973) has pointed out that eq. (1) is mathematically identical to the Michaelis-Menten 
equation of classical enzymology, which describes the relationship between the concentration of 
a substrate and the velocity of an enzyme-catalysed reaction: 

$$v = \frac{V_{\max}\, s}{K_M + s} \qquad (5)$$

where v is the reaction velocity, s is the substrate concentration and $V_{\max}$ and $K_M$ are the 
maximum velocity and the Michaelis constant respectively. It may be of interest to note that 
the suppression of responding by an alternative source of reinforcement is mathematically 
analogous to the phenomenon of competitive inhibition in enzymology. A competitive 
inhibitor of an enzymatic reaction is defined as a compound which increases the apparent value 
of $K_M$ without affecting the value of $V_{\max}$ (see Mahler & Cordes, 1966). 

The Jonckheere test for matched scores with unequal cell frequencies 


Chris Leach 


Meddis (1975) has suggested a test that is applicable in the situation where repeated measures are 
obtained from a number of subjects run under two treatment conditions and where the number 
of scores in each subject/treatment combination may be unequal. As pointed out by Sykes 
(1976), the Meddis test is identical to one proposed by Wilcoxon, which is discussed in Bradley 
(1968, pp. 115-117). The test is also almost identical to a special case of a test suggested by 
Jonckheere (1954b). Because the statistical properties of the Jonckheere test are well understood 
and because the test has been generalized to cover a large class of situations which are 
commonly encountered by psychologists, this version of the test is to be preferred. 

The Jonckheere test will be illustrated with Meddis's example data, which are reproduced in 
Table 1. The computations are particularly straightforward in this case, since there is no need to 
rank order the data. First, a Mann-Whitney U statistic is computed for each subject's data 
separately, by counting the number of observations under treatment II which are smaller than 
each of the observations under treatment I and summing these to obtain U. It is important that 
U be computed in the same way for each subject. Each of these Us is then converted into a 
z-score in the usual way (for example, Formula 6.9 in Siegel, 1956, p. 124), making use of the 
correction for ties where necessary in the calculation of the variance. The test statistic is then 
simply 


$$Z = \frac{1}{\sqrt{k}} \sum_{i=1}^{k} z_i \qquad (1)$$

where k is the number of subjects and $z_i$ is the z-score for subject i. Since each $z_i$ is 
approximately a unit normal deviate, the same will be true of Z. The approximation can be 
improved for small k by a continuity correction, taking 


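$$Z_c = \frac{1}{\sqrt{k}} \left[ \left| \sum_{i=1}^{k} z_i \right| - \frac{1}{2k} \sum_{i=1}^{k} \frac{1}{\sigma_i} \right] \qquad (2)$$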


as the test statistic, where $\sigma_i$ is the standard deviation of U for subject i (Jonckheere, 1970). If $\sigma_i$ 
is the same for each subject, this test statistic is identical to the statistic suggested by Meddis, 
and it may be computationally simpler to use Meddis’s formula. In other cases, the two test 
statistics will differ somewhat. Table 1 shows the calculation of Z and $Z_c$. The obtained Z of 2.64 
is slightly larger than the uncorrected value of 2.53 obtained using the Meddis statistic. The 
difference in these two values is due both to the variances for different subjects not being the 
same in this example, and to the fact that the Meddis statistic does not include a correction for 
ties. However, in this example, the conclusions from both tests are identical. 
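The computation of Z can be sketched directly from the description above. The function and variable names are illustrative, and the tie-corrected variance is written in the standard textbook form, which is assumed to be the one intended by the reference to Siegel; the data are Meddis's example scores as reproduced in Table 1.

```python
from collections import Counter
from math import sqrt


def mann_whitney_z(t1, t2):
    """Return (U, mean, sd, z) for one subject.

    U counts, over all pairs, treatment II scores smaller than treatment I
    scores, with tied pairs counting 1/2. The sd includes the correction for ties.
    """
    n1, n2 = len(t1), len(t2)
    n = n1 + n2
    u = sum(1.0 if b < a else 0.5 if b == a else 0.0 for a in t1 for b in t2)
    mu = n1 * n2 / 2.0
    ties = sum((t ** 3 - t) / 12.0 for t in Counter(t1 + t2).values())
    var = n1 * n2 / (n * (n - 1)) * ((n ** 3 - n) / 12.0 - ties)
    sd = sqrt(var)
    return u, mu, sd, (u - mu) / sd


# (treatment I scores, treatment II scores) for subjects 1 to 8.
data = [
    ([59], [54, 54, 72]),
    ([63], [52, 54, 56]),
    ([53, 66, 76], [48, 52, 57]),
    ([70], [50, 51, 51, 66]),
    ([62, 64, 86], [66]),
    ([70, 82], [68]),
    ([68, 77], [62, 64, 64, 77]),
    ([69, 73, 73, 77], [56, 64, 79]),
]

results = [mann_whitney_z(t1, t2) for t1, t2 in data]
z_scores = [r[3] for r in results]
Z = sum(z_scores) / sqrt(len(z_scores))   # formula (1)
print(round(Z, 2))
```

Run on these data, the sketch gives the same U values and z-scores as Table 1 and the Z of 2.64 quoted in the text.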

The distribution of Z converges extremely rapidly to a normal distribution, so that exact 
tables are unnecessary for most purposes. The fit depends both on k, the number of subjects, 
and on the number of observations under each subject/treatment combination (Jonckheere & 
Bower, 1967). This makes the test particularly useful when a large amount of data is obtained 
from relatively few subjects, as frequently occurs in animal learning experiments. 

Table 1. Computation of Z and $Z_c$, illustrated with data from Meddis (1975) 

           Treatment
Subject    I                  II                  U      μ_i    σ_i      1/σ_i    z_i
1          59                 54, 54, 72          2      1.5    1.061    0.943    0.471
2          63                 52, 54, 56          3      1.5    1.118    0.894    1.342
3          53, 66, 76         48, 52, 57          8      4.5    2.291    0.436    1.528
4          70                 50, 51, 51, 66      4      2.0    1.378    0.725    1.451
5          62, 64, 86         66                  1      1.5    1.118    0.894    -0.447
6          70, 82             68                  2      1.0    0.816    1.225    1.225
7          68, 77             62, 64, 64, 77      6.5    4.0    2.098    0.477    1.192
8          69, 73, 73, 77     56, 64, 79          8      6.0    2.803    0.357    0.714
Total                                                                    5.951    7.476

The μ_i and σ_i are computed as described in Siegel (1956, pp. 121 and 124). 
From the column totals, $Z = 7.476/\sqrt{8} = 2.64$; $Z_c$ is obtained from formula (2) using the 
totals $\sum z_i = 7.476$ and $\sum 1/\sigma_i = 5.951$. 


A further advantage of the Jonckheere test is that it allows a check on the homogeneity of 
subjects' results. If different subjects are showing strong trends in opposite directions, the 
overall test will be non-significant. Since we have computed a z-score for each subject, however, 
we can easily check on this, either by inspection or by making use of a further test suggested by 
Jonckheere & Bower (1967, p. 182). This test involves computing the statistic 


W = \sum z_i^2 - Z^2 = \sum (z_i - \bar{z})^2, (3) 


where Z is defined at (1) and \bar{z} is the mean of the z_i. Now, though W is approximately 
distributed as χ² with k - 1 d.f., the approximation will be good only if each of the individual z_i 
is also approximately normally distributed; it is not improved by using a large number of 
subjects, as is the case with the combined test of formula (2). For this reason, W could not be 
used for the data in Table 1, since there are too few observations under each subject/treatment 
combination. Inspection of the z_i for these data reveals that subject 5 shows a slight trend in the 
opposite direction from all the other subjects. 
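
The arithmetic of the homogeneity statistic can be sketched as follows. The sketch takes W to be the sum of squared deviations of the per-subject z-scores from their mean, which is one reading of formula (3), and uses the rounded z-scores from Table 1. As the text warns, the chi-square approximation is not trustworthy for these data, so the example shows only how W would be computed; the function names are hypothetical.

```python
from math import sqrt

# Rounded per-subject z-scores from the last column of Table 1.
Z_SCORES = [0.471, 1.342, 1.528, 1.451, -0.447, 1.225, 1.192, 0.714]


def combined_z(zs):
    """Assumed combined statistic: sum of the z-scores over sqrt(k)."""
    return sum(zs) / sqrt(len(zs))


def heterogeneity_w(zs):
    """Sum of squared deviations of the z-scores from their mean."""
    mean_z = sum(zs) / len(zs)
    return sum((z - mean_z) ** 2 for z in zs)


if __name__ == "__main__":
    w = heterogeneity_w(Z_SCORES)
    degrees_of_freedom = len(Z_SCORES) - 1
    # With enough observations per subject/treatment combination, w would be
    # referred to a chi-square distribution with k - 1 degrees of freedom.
    print(round(w, 3), degrees_of_freedom)
```

The same quantity equals the sum of the squared z-scores minus the square of the combined Z, which gives a quick check on a hand calculation.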

The application of the Jonckheere test described here is merely a special case of a test that is 
applicable to a large variety of situations. The test is an extension of Kendall’s coefficient of 
association, and is based on the idea that several other tests may be considered to be special 
cases of the Kendall test when there are extensive ties in one or both of the variables being 
correlated. Thus, the Mann-Whitney test can be conceived of as a test of association between 
the treatment variable and the response variable (Kendall, 1970, p. 42) and the statistic U is a 
simple linear transformation of the statistic S usually computed for the Kendall test. 
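
The linear relation between U and S can be illustrated directly. The text does not state the transformation; a standard form, assumed here, is S = 2U - n1*n2, where n1 and n2 are the numbers of observations under the two treatments and ties are counted as one half in U. The function names are hypothetical, and the data are those of subject 7 in Table 1.

```python
def mann_whitney_u(first, second):
    """Pairs in which the first-sample score is higher; ties count one half."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0
               for x in first for y in second)


def kendall_s(first, second):
    """Kendall's S for a two-level treatment variable: +1, -1 or 0 per pair."""
    return sum((x > y) - (x < y) for x in first for y in second)


if __name__ == "__main__":
    treatment_1 = [68, 77]
    treatment_2 = [62, 64, 64, 77]
    u = mann_whitney_u(treatment_1, treatment_2)
    s = kendall_s(treatment_1, treatment_2)
    # The assumed identity S = 2U - n1*n2 should hold, ties included.
    print(u, s, 2 * u - len(treatment_1) * len(treatment_2))
```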

The conclusions drawn above for the situation discussed in this paper are all direct 
implications of the general results about the distribution of S investigated in a series of papers by 
Jonckheere (Jonckheere, 1954a, b, 1970; Jonckheere & Bower, 1967). These papers also contain 
information about the power of S under certain alternative hypotheses. For an illustration of the 
wide applicability of the statistic S, an unpublished paper (Leach, 1975) is available from the 
author on request. The monograph by Ferguson (1965) also illustrates many of the uses of S. 


Absolute pitch: A case study 
Philip E. Vernon 


The auditory skill known as ‘absolute pitch’ is discussed, and it is shown that this differs greatly in accuracy 
of identification or reproduction of musical tones from ordinary discrimination of ‘tonal height’, which is to 
some extent trainable. The present writer possessed absolute pitch for almost any tone or chord over the 
normal musical range, from about the age of 17 to 52. He then started to hear all music one semitone too 
high, and now at the age of 71 it is heard a full tone above the true pitch. Tests were carried out under 
controlled conditions, in which 68 to 95 per cent of notes were identified as one semitone or one tone higher 
than they should be. 

Changes with ageing seem more likely to occur in the elasticity of the basilar membrane mechanisms than 
in the long-term memory which is used for aural analysis of complex sounds. Thus this experience supports 
the view that some resolution of complex sounds takes place at the peripheral sense organ, and this provides 
information which can be incorrect, for interpretation by the cortical centres. 


Most members of the British Psychological Society would probably doubt whether anything of 
scientific value could be derived from the introspections of a single subject. However, the 
experiences reported below have not, so far as I know, been previously described by any 
psychologist or musician; and they raise some points of rather crucial interest for theories of 
auditory perception. 

Ward (1963) and Carroll (1975) have provided excellent reviews of the quite extensive 
literature, from Stumpf (1883) onwards. They define absolute or ‘perfect’ pitch (AP) as the 
capacity to name a musical tone heard in isolation. Usually, though not necessarily, the 
possessor can also, when given the name of a tone, reproduce it by singing, or adjusting some 
form of oscillator. It is quite a rare phenomenon; at a rough guess it occurs among less than 5 
per cent of trained musicians. Only a few hundred cases in all have been reported on in the 
literature, and the credentials of some of these were doubtful. ‘Perfect’ does not, of course, 
imply correct to the nearest cycle of frequency (c/sec.); but it does mean within some fraction of 
a semitone, say one-quarter. Thus if A-4 (the A-string on the violin) normally corresponds to 
440 c/sec, and B flat, one semitone above, is about 466, the AP listener might well become confused 
on hearing 450 c/sec, but should certainly identify 445 with A rather than B flat. 

Exact definition of AP is difficult, since different investigators have used a variety of 
techniques. Sometimes the subject is told to adjust an oscillator until the pitch reaches a 
required note, or to judge which of a series of closely spaced tuning forks is correct. Since, 
however, pure (sinusoidal) tones may be more difficult to identify than the complex tones 
normally occurring in music, it seems more appropriate to present stimuli by a piano or organ, 
and ask for their identification by name; or, as Carroll did, to seat the subject at a second piano 
and ask him to replicate each of a random series of heard notes. 

There has been much discussion regarding the origins of AP. Some, like Bachem (1955), 
regard it as a genetic ‘gift’ which appears some time in childhood without any specific training, 
though he admits that considerable musical interest and experience are necessary. Others have 
argued that it is trainable, and that possessors of AP merely represent the extreme top end of a 
normal distribution. But the facts are against them. Even non-musicians can, with a good deal of 
practice and feedback, reduce their range of errors of identification to less than an octave on 
either side of the given note. In other words, they can learn to distinguish something like 9 to 13 
different pitches spread out over the range of six octaves or so (which encompasses most of the 
commonly used musical scale). But persons with AP can identify something like 60 different 
semitones, though admittedly they vary in their accuracy from 100 down to 50 per cent. Both 
Ward and Carroll have discussed the issue in terms of information theory, and conclude that the 
ordinary listener perceives about 214 bits, whereas a skilled AP person is perceiving 61^ bits (at 
least over most of the range; at the extremes of the pianoforte scale, accurate judgement 
becomes more difficult). 

Révész in 1913 made the rather important distinction between ‘tone height’ and ‘chroma’. 
Height is the approximate level on, say, the piano scale, from bottom to top, whereas chroma is 
cyclic, i.e. the particular note within any one octave. AP individuals not infrequently identify the 
correct chroma but misidentify the octave in which the note falls. Presumably it is tone height 
rather than chroma which is to some extent trainable. 

Ward asks whether AP is any more remarkable than the capacity that everyone possesses to 
identify accurately large numbers of different people by their faces, or a lot of different noises. 
Surely though the point is that AP has so little stimulus difference to go on. If two notes are a 
semitone apart, but identical in intensity, timbre or any other characteristics, they differ only in 
one having 5.9 per cent greater c/sec than the other. Ward suggests too that many musical 
children could be trained to accurate identification if young enough not to have become 
oversensitized to relative pitch (this means the capacity of the musician, when given a starting 
tone, e.g. A-4, to sing F-4, or any other interval, with a high degree of accuracy). But Ward’s 
evidence is very meagre, and it is not true, as he claims, that every person with AP has 
received practice and reinforcement in note identification during childhood. 
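
The figure of 5.9 per cent for a semitone can be checked with a short calculation. The sketch below assumes equal temperament, in which each semitone multiplies the frequency by the twelfth root of 2, and a reference of A-4 = 440 c/sec; the text does not say which tuning system is meant, and the names used are hypothetical.

```python
# Frequency ratio of one equal-tempered semitone (assumed tuning system).
SEMITONE_RATIO = 2 ** (1 / 12)


def frequency(semitones_from_a4, a4=440.0):
    """Frequency in c/sec of the note a given number of semitones from A-4."""
    return a4 * SEMITONE_RATIO ** semitones_from_a4


def semitone_step_percent():
    """Percentage by which one semitone raises the frequency."""
    return (SEMITONE_RATIO - 1) * 100


if __name__ == "__main__":
    print(round(semitone_step_percent(), 1))  # percentage step per semitone
    print(round(frequency(1), 1))             # one semitone above A-4
    print(round(frequency(12), 1))            # one octave above A-4
```

Because the step is a fixed ratio rather than a fixed number of cycles, the difference in c/sec between adjacent semitones grows towards the top of the keyboard while the percentage difference stays the same.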

Finally it is interesting to note that AP is not essential to musicianship nor, apparently, is it 
associated with any other special skill. While it can be useful to brass instrument players who 
want to hit the right harmonic, it can be a handicap in other circumstances, particularly to 
singers, or as will appear in my own experience, below. 


Personal history 

To the best of my knowledge, I received no specialized training other than ordinary piano 
lessons from 6 years, and singing at church, school, or at home. At about the age of 17, I noticed 
rather suddenly that I was able to recognize the key of any ordinary (yet unfamiliar) musical 
excerpt, when played on a piano, organ, or by an orchestra. I seemed to perceive each key as 
having a distinctive quality or timbre, which differentiated passages or chords in the key of C 
from those in C sharp or B. It followed that, by using relative pitch, I could easily identify by 
name any note or phrase in a not too complex piece of music. A few years later I could also 
successfully accomplish the parlour trick of identifying any four notes, however discordant, 
played simultaneously near the middle of the piano keyboard. As generally occurs, I had greater 
difficulties in identifying notes or chords either in the top two or bottom two octaves of the 
piano scale. I also found the pitch of vocal music more uncertain, partly, at least, because 
unaccompanied choirs or soloists are very apt to rise or fall in pitch to some extent; whereas 
keyed instruments stay constant. 

This perceptual skill continued unabated until the age of 52, when I began to notice a tendency 
to identify keys as being one semitone higher than they should be, and this soon became 
stabilized (except when I was playing the piano or organ myself, and could, as it were, force myself to 
hear C as C, not as C sharp). This was highly disconcerting, since I also tend to associate 
particular keys with particular moods - a kind of synaesthesia (Vernon, 1930). C major to me is 
strong and masculine, whereas C sharp is more lascivious and effeminate. Hence it was 
distressing to hear, for example, the Overture to Wagner's Mastersingers of Nuremberg — an 
eminently C major composition — apparently being played in C sharp. Thus when I was singing in 
church, or a chorus, or listening to music at concerts, I habitually transposed a good deal of the 
time. That is, I inferred that what I heard as C sharp or D was really C and C sharp 
respectively. 




Now at the age of 71 the inevitable seems to be happening, and I usually identify notes or 
keys 2 semitones (one whole tone) too high. Wagner’s Overture is now quite clearly being played 
in the key of D! Actually I fluctuate between the semitone and full tone (especially when playing 
the piano), but it seems probable that the latter is becoming predominant. Also if I live long 
enough it may well happen that my AP will become 3 or even more semitones ‘out of gear’. 


Experimental tests 

With the publication of Carroll’s monograph, I decided to apply his test to myself. He had four 
AP subjects, with a fifth one who claimed to have trained himself in true chroma AP, and four 
other musicians who did not have AP. In his main series of trials, the 64 notes falling between 
A-1 and C-7 were played in randomized order, except that no two consecutive stimuli were a 
whole tone or semitone apart at any octave (e.g. D-3 would not be followed by E-2). The stimuli 
were sounded on a piano at 15 sec intervals, and the latency of each subject's response - the 
time taken to move his hand and play the note he identified - as well as the accuracy, were 
recorded. However, latencies revealed no notable differences between the AP and NAP subjects. 
The experimental trials were given four times in all. With the aid of my wife at the piano, and 
Mr Jeff Campbell, a local organist, at the organ, I carried out three trials, and wrote down each 
note I heard on music manuscript paper, with the results shown in Table 1. 


Table 1. Percentage distribution of responses to tonal stimuli 

                        Trial 1    Trial 2    Trial 3         Trial 3 
                        Piano      Organ      Piano chords    Lowest 48 chords 
One tone sharp          25.0       40.7       18.8            25.0 
Semitone sharp          62.5       50.0       59.3            70.8 
Correct note            10.9        4.7        6.3             2.1 
Miscellaneous errors     1.6        4.7       15.6             2.1 


Disregarding octave errors for the moment, Carroll’s average AP subject scored 71 per cent 
correct chroma, range 48.0 to 92.8 per cent. The mean for NAP was 12.5 per cent, and the highest 
scorer got 14.8 per cent. In my own case, if the first two categories (semitone or tone too high) 
are accepted, I scored 87.5 per cent in the first trial, with 12.5 per cent chroma errors. Thus I 
fell near the top of Carroll’s APs, except for wavering between semitone and tone. Since I 
suspected that I would more consistently judge one tone too high on an instrument other than 
the piano, Trial 2 was performed on a rather reedy combination of 8 ft organ stops. It may be 
seen that there were still rather more semitone than full tone judgements, but the total ‘correct’ 
rises to 90.7 per cent. 

The experiment was repeated once more on the piano, but with three-note major chords (e.g. 
C-4, E-4, G-4) instead of single notes, on the supposition that I identified more by key-quality 
than by pure note chroma. In fact, however, there were now more errors, namely 21.9 per cent, 
since in the top two octaves I sometimes heard such a chord as based on G or E instead of C. 
Taking only the lower 48 notes (A-1 to A flat-5), I made only two errors. Thus if one semitone or 
one tone are allowed, I achieved 95.8 per cent correct. 
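
The 'shifted correct' percentages quoted in this section follow from adding the first two rows of Table 1 of this case study. A minimal sketch of that arithmetic is given below; the dictionary layout and function names are hypothetical, and the figures are simply those of the table.

```python
# Percentages from Table 1 of the case study, one entry per column.
TABLE_1 = {
    "Trial 1 piano": {"tone": 25.0, "semitone": 62.5, "correct": 10.9, "other": 1.6},
    "Trial 2 organ": {"tone": 40.7, "semitone": 50.0, "correct": 4.7, "other": 4.7},
    "Trial 3 piano chords": {"tone": 18.8, "semitone": 59.3, "correct": 6.3, "other": 15.6},
    "Trial 3 lowest 48 chords": {"tone": 25.0, "semitone": 70.8, "correct": 2.1, "other": 2.1},
}


def shifted_correct(column):
    """Percentage of responses one semitone or one tone sharp."""
    return round(column["tone"] + column["semitone"], 1)


def chroma_errors(column):
    """Percentage of responses in neither of the two 'sharp' categories."""
    return round(column["correct"] + column["other"], 1)


if __name__ == "__main__":
    for name, column in TABLE_1.items():
        print(name, shifted_correct(column), chroma_errors(column))
    # Two errors among the lowest 48 chords, expressed as a percentage correct.
    print(round(100 * (48 - 2) / 48, 1))
```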

My numbers of octave errors were similar to those noted by Carroll, who obtained a mean of 
5; mine were 4, 6, 5 and 1 respectively. 


Auditory theory 

Despite a very large amount of physiological and experimental psychological investigation (cf. 
Tobias, 1970), there is still much uncertainty as to how the aural system perceives musical 
stimuli consisting of complex combinations of notes, simultaneously or successively. The 
original resonance or ‘harp’ theory of von Helmholtz has been abandoned since von Békésy (1928) 
and others have shown that the basilar membrane is an elastic surface, nothing like a series of 
strings under tension. Apparently any single tone stimulus sets up a travelling wave moving from 
the base to the apex of the membrane, and the wave pattern, particularly the region of maximum 
displacement, which stimulates the hair cells, does differ according to the pitch of the tone. But 
this could not possibly account for the resolution, by a musical listener, of a chord into a set of 
numerous notes at different pitches, which are recognized as emanating from several different 
orchestral instruments, each with a different timbre. Moreover each note involves different sets 
of partials or overtones and formants, and any such stimulus is likely to produce a number of 
difference or combination tones. As I argued in 1935, musical perception must depend more on 
analysis at the cortical than at the peripheral level. 

On the other hand the so-called ‘telephone’ type of theory, first proposed by Rutherford, 
which regards the cochlea merely as a kind of microphone for transmitting all frequencies up to 
the auditory cortex, runs into difficulties because of the very high frequencies of some of the 
components of musical stimuli or noises, e.g. exceeding 3000 c/sec; though it seems possible that 
through synchronous firing of numerous neurones (the ‘volley’ theory) the auditory nerves might 
be able to transmit very high frequencies. Apparently the majority of writers now agree that 
there is some resolution or breakdown of complex auditory stimuli at the cochlea, but also 
transmission to higher centres which undertake the analysis of these complex Gestalten into 
their musical components. Some would hold that one mechanism operates for pitches within 
the musical range, e.g. up to about 2000 c/sec, and a different one for the higher wave bands 
which carry the upper partials and formants. The phenomenon of absolute pitch must imply that 
long-term storage somehow carries a set of ‘templates’ corresponding to chroma and octave 
height, which make possible the immediate identification of 60 or more notes of different pitches. 
This is a particularly subtle form of discrimination, because of the paucity of physical 
differences between adjacent stimuli, noted above. 

Nevertheless it is here that my own experience of a change in pitch recognition with ageing 
appears relevant. It seems fairly plausible that, with advancing years, the basilar membrane 
might become somewhat less elastic. Hence that section of the membrane which was originally 
associated, say with middle C, would now react maximally to pitches a semitone or two lower, 
i.e. B or B flat, which are therefore interpreted as C. While we cannot rule out the alternative 
possibility that neurological or biochemical changes have affected the central mechanisms of 
auditory analysis and identification, this seems to me much less likely. I cannot think of any 
parallel instance in visual or verbal thought of slow, unidirectional, change with age in a 
structured set of long-term memory templates. 

Although I stated at the outset that no similar experiences had come to my notice, there is one 
relevant study by Wynn (1972). He describes two females and one male with absolute pitch, whom 
he tested repeatedly over two to three months, getting them to sing the note A-4, or to adjust an 
audio-oscillator to this standard pitch. The females showed cyclical fluctuations in their average 
readings, amounting to about 8 c/sec from peak to trough in one subject, and 13 c/sec in the 
other. These occurred every two weeks, corresponding to the onset of menstruation and to 
ovulation. He suggests therefore that biochemical changes affecting the cochlea mechanisms, 
rather than physical changes, are responsible. The male subject too showed some suggestion of 
cyclical variations at about a 20-day rhythm. Wynn also mentions some longer term drifts in 
base-line AP associated with illness or possibly with an ageing effect. One subject, over seven 
years, dropped her A-4 from 460 to 440 c/sec, which is close to a semitone. 
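
The statement that a drop from 460 to 440 c/sec is 'close to a semitone' can be checked by expressing a frequency ratio in semitones. The sketch assumes the equal-tempered semitone (a ratio of the twelfth root of 2), which the text does not specify; the function name is hypothetical.

```python
from math import log2


def interval_in_semitones(f_from, f_to):
    """Signed size of the interval from f_from to f_to, in equal-tempered semitones."""
    return 12 * log2(f_to / f_from)


if __name__ == "__main__":
    # Long-term drift reported for one subject: 460 c/sec down to 440 c/sec.
    print(round(interval_in_semitones(460.0, 440.0), 2))
    # For comparison, an octave is exactly 12 semitones.
    print(round(interval_in_semitones(440.0, 880.0), 2))
```

On this assumption the seven-year drift comes to about 0.77 of a semitone downwards, so 'close to a semitone' is a fair but slightly generous description.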

In summary: this case study indicates that the auditory mechanisms whereby the pitches of 
musical notes can be accurately identified, are originally dependent on, but can later deviate 
from, the actual pitch as registered at the cochlea. Possibly other psychologists, with greater 
expertise than myself in auditory theory, can collect additional cases and arrive at better 
hypotheses regarding the physiological explanation of the phenomena. 


Characteristics of items in the Eysenck Personality Inventory which affect 
responses when students simulate 


R. P. Power and K. D. MacRae 





A large sample of students completed Form A of the Eysenck Personality Inventory, and four subgroups 
were later asked to simulate extraversion, introversion, neuroticism or stability. It was found that subjects 
could simulate these four personalities successfully. The changes in individual item responses were 
correlated with the items’ factor loadings, validity, response bias, and detectability. The different scales and 
types of item were considered separately. In some cases the changes in item responses when simulating 
introversion and extraversion were related to the extraversion validities and factor loadings of the items. 
More often, however, the behaviour of items under simulation was correlated with aspects of the items that 
made them more like an item from another scale and thus lessened their susceptibility to a particular type of 
simulation. 


During the last few years we have carried out studies on various aspects of personality 
inventories. It has been shown, for example, that subjects can simulate a variety of personality 
characteristics when completing personality inventories (Power, 1968; Power & O'Donovan, 
1969); and that they can detect the scale to which individual items belong, the detectability of 
items as being of a certain type being related to the factor loadings of the items (Power & 
MacRae, 1971). It has also been demonstrated that when a suitable decision rule is available 
inventories can be quite powerful diagnostic tools (Power, MacRae & Muntz, 1974), and that, 
given adequate normative data, clinical psychologists can approach the accuracy level achieved 
by discriminant function analysis (Power, Muntz & MacRae, 1975). Recently, an analysis 
(MacRae & Power, 1975) of the individual items of the Eysenck Personality Inventory (EPI) 
(Eysenck & Eysenck, 1964) has shown that this inventory contains items which differ 
considerably in validity, some items being more closely associated with the total score on one of 
the other two scales than their own (as defined by the Eysencks’ scoring key). On the basis of 
this analysis it is now possible to construct alternative scoring keys for the EPI, particularly 
ones which achieve a better balance between yes and no responses than exists at the moment. It 
would also be possible to construct very much shorter versions of the EPI, choosing only the 
items which are highly correlated with the total scores on the three scales. 

So far little attention has been paid to the discovery of which items are most susceptible to 
change of response when subjects are simulating, or to determining the characteristics of items 
which are most affected by attempts at simulation. It might be considered that we have already 
partially answered this question. In one study (Power & MacRae, 1971) we discovered 
substantial correlations between detectability of the items and their factor loadings when we 
ignored the scale to which the items were allocated by the Eysencks. When the 
introversion-extraversion (IE) scale was considered the correlation between factor loadings and 
detectability remained much the same, but it vanished when the neuroticism-stability (NS) scale 
was considered. In so far as this study is an answer, it tells us something about the IE scale 
items, but nothing useful about the NS items. The reason for not being sure that it is an answer 
at all is that it is not clear that detecting items (i.e. going through the inventory and saying, for 
each item, whether a yes or a no answer adds a score of one to the introversion, 
neuroticism or lie score) is the same as simulating a particular personality. Certainly there is 
some evidence (MacRae & Power, 1976) that the two tasks are more different than would, a 
priori, be expected. The study to be reported here, then, was designed to establish which items 
change when subjects simulate, whether these items are the most easily detectable, their 
relationship to the factor loadings of the items, and the response bias of the items. 

A large group of subjects was given copies of the EPI (Form A) and was asked to complete it 
in the usual fashion. Subgroups were then asked to simulate one of four personalities 
(introverted, extraverted, neurotic or stable) when completing the inventory. The extent to which 
responses to the individual items changed under the simulation conditions was then related to (1) 
the factor loadings of the items, (2) the initial response tendency or bias (i.e. frequency of saying 
yes to the items in the control condition), (3) the previously determined detectability of the items 
as being items of a certain type, and (4) the validity of the items as predictors of total scores on 
each of the three scales of the EPI, extraversion, neuroticism and lying (attempting to simulate a 
conventionally desirable personality). 


Method 
Subjects and procedure 


A group of 259 students was asked to complete Form A of the EPI during one of their first laboratory 
classes. This large group was subdivided, and four of the subgroups provided the simulation data reported 
here. A group of 50 students was asked to simulate extraversion (E), a group of 53 students was asked to 
simulate introversion (I), a group of 40 students was asked to simulate neuroticism (N) and a fourth group of 
36 students was asked to simulate stability (S). The instructions were based on those used by Power (1968). 


Design and analysis 


The questionnaires completed by the large group of 259 students provided control data for three purposes: 
(1) to obtain values for this sample on the three scales, (2) to establish the percentage of subjects saying yes 
to each of the 57 items of the EPI, and (3) to establish the validity of each of the items on each scale, where 
validity refers to the extent to which the item predicts the total score on each scale, rather than predicting 
some external criterion. The procedure has been described in greater detail (MacRae & Power, 1975), but 
essentially it consisted of determining the mean scores on E, N and L for those subjects who said yes to 
each item, and the mean scores of the same variables for the subjects who said no to each of the items. 
Student's t tests were then carried out to discover whether, for each item, saying yes or no to it led to a 
significantly higher or lower score on each of the variables. The value of t was used as a measure of validity. 
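
The validity index described above can be sketched as an ordinary two-sample t computed between the scale scores of those who said yes and those who said no to an item. The sketch assumes the classical pooled-variance form of Student's t; the text does not say how the variances were handled. The scores in the example are invented purely for illustration and are not data from the study, and the function name is hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def item_validity_t(scores_yes, scores_no):
    """Pooled-variance Student's t between yes-responders and no-responders."""
    n_yes, n_no = len(scores_yes), len(scores_no)
    pooled_variance = ((n_yes - 1) * variance(scores_yes)
                       + (n_no - 1) * variance(scores_no)) / (n_yes + n_no - 2)
    standard_error = sqrt(pooled_variance * (1 / n_yes + 1 / n_no))
    return (mean(scores_yes) - mean(scores_no)) / standard_error


if __name__ == "__main__":
    # Invented E-scale totals for subjects answering yes or no to one item.
    e_scores_yes = [15.0, 17.0, 14.0, 18.0, 16.0]
    e_scores_no = [9.0, 11.0, 10.0, 12.0, 8.0]
    print(round(item_validity_t(e_scores_yes, e_scores_no), 2))
```

A large absolute t on an item's own scale, with small values on the other two scales, is what the text treats as evidence that the item is valid for that scale.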

The factor loadings of the items on F1 (N) and F2 (E) were obtained (Eysenck, Hendriksen & Eysenck, 
1969), as was the detectability of the items as being items of one of the three types (MacRae & Power, 
1976). The procedure for discovering the detectability of the items has been described before (Power & 
MacRae, 1971), but the main point is that naive subjects were asked to go through the inventory and specify 
to which of the three scales each of the items belonged. 

The means on each of the three scales for each of the four subgroups were obtained in order to establish 
how successfully the subjects could simulate the specified personality. 

The four simulating subgroups were chosen to represent the polar opposites on the two personality factors 
measured by the EPI. The susceptibility of each item to change was established by subtracting the 
percentage of people who said yes when simulating I from the percentage of people who said yes when 
simulating E. A similar procedure was carried out for the NS dimension. This calculation will be carried out 
for item one in order to illustrate the procedure. When E was being simulated the percentage of yes 
responses was 84 per cent and when I was being simulated the percentage was 30 per cent. The IE shift in 
yes responses for item 1 is, therefore, 54 per cent, the maximum shift being, of course, 100 per cent and the 
minimum 0 per cent. For this item when subjects were simulating N the percentage of yes responses was 73 
per cent and when the subjects simulated S the percentage was 22 per cent. The percentage shift in response 
to NS simulation is, therefore, 51 per cent. (Question 1 is an item in the E scale, a yes response adding 1 to 
the score on E.) 
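
The shift calculation for item 1 can be written out as a short sketch. Because the text gives the minimum shift as 0 per cent, the sketch assumes that the sign of the difference is ignored; the function name and dictionary layout are hypothetical, and the percentages are those quoted above for item 1.

```python
def response_shift(percent_yes_a, percent_yes_b):
    """Size of the change in 'yes' responses between two simulation conditions."""
    return abs(percent_yes_a - percent_yes_b)


# Percentage of 'yes' responses to item 1 under each simulation condition.
ITEM_1 = {"E": 84, "I": 30, "N": 73, "S": 22}

if __name__ == "__main__":
    ie_shift = response_shift(ITEM_1["E"], ITEM_1["I"])
    ns_shift = response_shift(ITEM_1["N"], ITEM_1["S"])
    print(ie_shift, ns_shift)
```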

Finally, the items were considered separately according to their recommended scoring (Eysenck & 
Eysenck, 1964). There are three scales, E, N and L, with the E and N scales having 24 items each and the L 
scale nine items. All the N items are scored in the same way, a yes response adding 1 to the N score. The N 
items will therefore be referred to as N-yes items. The 24 E items are scored in two ways: for 15 of them a 
yes response adds 1 to the E score (E-yes items) but for the remaining nine items a no response adds one to 
the E score. The latter set of items will be referred to as E-no items. In the same way there are three L-yes 
items and six L-no items in each form of the EPI. 


Results and discussion 


It can be seen from Table 1 that the subjects have successfully simulated each of the four 
personalities, E, I, N and S, the mean scores departing considerably from those of the control 
group values. 


Table 1. Number of subjects (n), means on the three scales extraversion (E), neuroticism (N) 
and lying (L) for the five groups: non-simulating, and simulating extraversion (E), introversion 
(I), neuroticism (N) and stability (S) 


                   n      E mean    N mean    L mean 
Non-simulating     259    12.3       9.4      2.3 
Simulating E        50    23.1       4.5      1.3 
Simulating I        53     1.6      16.7      4.1 
Simulating N        40    10.7      21.7      1.7 
Simulating S        36     9.7       3.4      3.8 


The E mean for the non-simulators is 12.3, which rises to 23.1 for E simulation and falls to 1.6 
for I simulation. As usual, the induced production of high E reduces the N score and of high I 
increases the N score. The L score also reduces for E simulation and increases for I simulation. 

Simulation of N and S also produces large changes in the relevant scale, the control mean of 9.4 
increasing to 21.7 under N simulation and reducing to 3.4 under S simulation. The E scale does 
not show a large effect as a result of NS simulation. The group simulating N has a lower L mean 
than the group simulating S. 

The fact that simulation on the IE dimension leads to big changes in the scores on the N scale, 
but simulation on the NS dimension leads to small changes in the E scale, suggests that in the 
minds of subjects introversion implies neuroticism but not vice versa. The changes on the lie 
scale fit patterns previously established with children (Power & Brown, 1973; Power & Stoppard, 
1973). Subjects simulating S have high lie scores, as would be expected, since the scale is 
designed to detect people presenting themselves as desirable. Subjects simulating I also have 
high lie scores, since many of the items reflect obsessoid behaviour (Power et al. 1974), which 
seems to be related, at least in the minds of subjects, to introversion. Subjects simulating E have 
low scores on the lie scale, as do those simulating N. It is worth mentioning again, although the 
point has been made before (Power et al. 1974), that the pattern of scores of genuine neurotics 
is different from those simulating neuroticism: genuine neurotics tend to have high N scores and 
high L scores. 

A summary of the behaviour of the individual items is given in Table 2. 

The N-yes items show more of a change (77 per cent) with NS simulation than they do with 
IE simulation, although the IE change is over 50 per cent. This is a further example of the 
previously noted (Power et al. 1974) apparent belief of subjects that introversion implies 
neuroticism. The validity analysis of the N items shows the highest average t for the N scale 
with the ts for the other two scales being very low. The detectability data (MacRae & Power, 
1975) show that just over half (56 per cent) of the subjects could state accurately the scale to 
which the N items belonged. At first sight this is lower than might be expected, when the 
success of the simulation is considered. The detectability task, however, is more difficult than the 
simulation task, since the subject has to state exactly what type of item each question is. When 
simulating, on the other hand, the subject does not have to decide whether the item is, say, more 
N than E: if there is any N in it at all, and he is simulating N, then he can say yes, even though 
the item has a lot of E to it. If he is correct in his guess the simulation increases the intended 

Table 2. Summary of data on the items. 

(The type of item, the number of such items (i), the mean percentage change in ‘yes’ responses between 
simulating introversion and simulating extraversion (IE) and between simulating neuroticism and simulating 
stability (NS), the mean percentage of ‘yes’ responses in the non-simulating sample (Y), the mean factor 
loadings on F1 (N) and F2 (E), the mean percentage of correct detections of the items (DET), and the mean 
t values (ignoring sign) between the ‘yes’ and ‘no’ responders on each of the three scales (tE, tN and tL, 
respectively).) 


Type     i     IE      NS      Y       F1 (N)   F2 (E)   DET     tE      tN      tL 
N-yes    24    56.5    77.0    39.3    0.41      0.03    56.5    1.38    6.43    1.35 
E-yes    15    89.7    32.3    45.9    0.00      0.39    63.5    6.91    1.61    1.54 
E-no      9    89.4    41.0    39.7    0.07     -0.26    42.9    5.86    1.67    1.71 
L-yes     3    24.3    27.3    30.0    -         -       50.7    0.88    2.37    5.39 
L-no      6    50.5    21.7    78.7    -         -       44.7    2.27    1.96    5.55 





scale score. If incorrect, the simulation merely affects a non-intended scale as there is no penalty 
for false positives. The shifts in the N scale means seen in Table 1 for IE simulation are an 
example of this. 

The E items, both E-yes and E-no, show even greater changes with IE simulation than the N 
items showed with NS simulation, the shift being almost 90 per cent. The E items, however, are 
only moderately affected by NS simulation (32 and 41 per cent for yes and no items), bearing out 
the small difference in the E means of the N and S groups in Table 1. This demonstrates yet 
again that the subjects believe that introversion implies neuroticism, but not vice versa. The 
detectability of the E items, taken as a whole, is on average similar to that of the N
items. The validity analysis shows that the E items are on average valid only on the E scale.

The L items shift relatively little under either IE or NS simulation, the largest effect being on 
the L-no items with IE simulation, a finding reflecting the high L mean for the I group in Table 1. 
The validity analysis indicates the L items are in part weak measures of E or N, but shows 
that they are, on average, good predictors of the L scale scores, L-yes and L-no items being, for 
all practical purposes, equally satisfactory. The detectability of these items is a little lower than 
the detectability of the other items. 

The full correlation matrices are shown in Tables 3 to 7, one table dealing with each type of 
item. Our main interest, of course, is in correlations involving the susceptibility of the items to 
simulation. 

Table 3. Intercorrelations for the N-yes items (key as for Table 2)

         IE        NS        Y         F1 (N)    F2 (E)    DET       tE        tN        tL
IE       -        -0.13      0.15     -0.36     -0.41*    -0.12      0.59**    0.33     -0.36
NS       -         -        -0.76***   0.12     -0.40*     0.28     -0.44      0.47     -0.13
Y        -         -         -         0.05      0.29     -0.27      0.15     -0.24      0.27
F1 (N)   -         -         -         -        -0.13      0.41*     0.32      0.60*    -0.26
F2 (E)   -         -         -         -         -        -0.02     -0.42*     0.17      0.47*
DET      -         -         -         -         -         -        -0.29      0.27      0.18
tE       -         -         -         -         -         -         -         0.32     -0.28
tN       -         -         -         -         -         -         -         -         0.23
tL       -         -         -         -         -         -         -         -         -

* P < 0.05. ** P < 0.01. *** P < 0.001.

Table 3 shows that with the N items there are significant correlations between susceptibility to
IE simulation and both the E factor loading (F2) and the E scale validity of the items. The
negative sign of the correlation with the E loading shows that the N items with the greater 
susceptibility to IE simulation are those with higher negative loadings on the E factor, and the 
positive correlation with E scale validity shows that such N items are more valid measures of E. 
The absence of significant correlations between change with NS simulation and the N factor 
loading and validity is probably a result of the factorial purity of the N scale items restricting the 
variance in these measures. There are, however, for these N items, two significant correlations 
with NS simulation. As can be seen in Table 3, one is between NS simulation change and the yes
response bias of the items under control conditions, and the other is between NS simulation change
and the E factor loading of the items. Both correlations are negative, and the first indicates that those N
items which have a lower base-line percentage of yes responses are more susceptible to 
simulation. Presumably as fewer people will confess to these supposed neurotic symptoms under 
the control condition, they are such extreme examples of N items that they are easy to detect 
and simulate on. 

This may seem odd, since there is no significant correlation with the N factor loadings, i.e. it 
might be expected that extreme examples of N would have higher factor loadings. This may not 
be so. Another possibility is that the homogeneity of the factor loadings does not allow the effect 
to reach significance, although in this connection it should be noted that there is sufficient 
heterogeneity to allow significant correlation with other variables. The negative correlation with 
the E loading shows that the N items with highish negative E loadings are more susceptible to 
NS simulation change, perhaps because of the common belief in a negative correlation between 
N and E. 

In the remainder of the correlation matrix there are four further significant correlation 
coefficients. The first three of these are not at all surprising: two are between the N factor 
loadings of the items and the detectability and validity of the items respectively, and a third is 
between the E factor loading and the E validity of the items (the negative sign reflecting the 
negative E factor loadings of the N items). The fourth coefficient is between the E factor loading 
of the items and their validity as L items. Inspection of the individual factor loadings and 
validities reveals that this correlation results largely from the few N items with positive E 
loadings (the majority of N items have smallish negative E loadings and a few have moderate 
positive loadings) being more valid measures of L, in that those who respond yes to these items 
have higher L scores. Thus, the denial of such extraverted N items (for example, Q2, ‘Do you 
often need understanding friends to cheer you up?’) is correlated with behaviour on the standard 
L scale items. 

Considering the E items now, Tables 4 and 5 show the intercorrelations for the E yes and E 
no items respectively. 

Table 4. Intercorrelations for the E-yes items

         IE        NS        Y         F1 (N)    F2 (E)    DET       tE        tN        tL
IE       -        -0.43     -0.24     -0.39      0.17      0.38      0.50     -0.46     -0.14
NS       -         -         0.37      0.07     -0.18     -0.30     -0.26      0.45      0.24
Y        -         -         -        -0.26      0.40      0.03      0.09      0.31      0.36
F1 (N)   -         -         -         -         0.05     -0.33     -0.24     -0.37      0.45
F2 (E)   -         -         -         -         -         0.66**    0.29     -0.27      0.25
DET      -         -         -         -         -         -         0.16      0.0      -0.28
tE       -         -         -         -         -         -         -        -0.39      0.26
tN       -         -         -         -         -         -         -         -        -0.13
tL       -         -         -         -         -         -         -         -         -

None of the correlations with IE or NS simulation changes are significant for the E-yes items,
and indeed, the only significant correlation in the whole table is between the E factor loading
and the detectability of the items. This correlation is in accord with previous work (Power & 
MacRae, 1971). 


Table 5. Intercorrelations of the E-no items

         IE        NS        Y         F1 (N)    F2 (E)    DET       tE        tN        tL
IE       -        -0.32     -0.61      0.75*    -0.67*     0.58      0.39      0.59      0.05
NS       -         -        -0.17     -0.09     -0.37     -0.19     -0.09     -0.13      0.13
Y        -         -         -        -0.72*     0.56     -0.19     -0.32     -0.41      0.01
F1 (N)   -         -         -         -        -0.75*     0.74*     0.66      0.62      0.03
F2 (E)   -         -         -         -         -        -0.7      -0.58     -0.51      0.14
DET      -         -         -         -         -         -         0.58      0.58      0.33
tE       -         -         -         -         -         -         -         0.54      0.50
tN       -         -         -         -         -         -         -         -         0.36
tL       -         -         -         -         -         -         -         -         -

There are, however, for the E-no items, significant correlations between IE simulation change 
and both the N and E loadings of the items. As all the E loadings of the E items are negative, 
the negative correlation coefficient indicates that the higher this negative loading the greater the 
change under IE simulation. Also these items tend to have small positive N loadings, and the
higher the N loading the greater the susceptibility of the item to IE simulation. As the N and E 
loadings are negatively correlated in these items, this correlation further represents the 
correlation seen with the E loading. 

As would be expected from their respective correlations with IE simulation change, the N and 
E loadings of these items are negatively correlated. The other two significant correlations are
between the N loadings of the items and their initial response bias and their detectability as 
being E-no items. The negative correlation between the N loading and the tendency to say ‘yes’ 
to these items indicates that as the N content of the items increases, fewer control subjects say 
yes to the items. The positive correlation between the N loading and the detectability of the 
E-no items as being E-no items is probably partially a result of the previously mentioned belief 
in a negative correlation between extraversion and neuroticism. A moderate N loading in an
E-no item could make the item even more obviously E-no, as for example in Q. 29, ‘Are you
mostly quiet when you are with other people?’, which has an N loading of 0.30 in addition to its
E loading of -0.33.

Table 6. Intercorrelations of the L-yes items

         IE        NS        Y         DET       tE        tN        tL
IE       -        -0.21     -0.06     -0.89      0.99*    -0.20     -0.99*
NS       -         -         0.99*     0.63     -0.15     -0.92      0.19
Y        -         -         -         0.51      0.00     -0.97      0.04
DET      -         -         -         -        -0.86     -0.27      0.88
tE       -         -         -         -         -        -0.26     -0.99*
tN       -         -         -         -         -         -         0.22

Because of the small number of L items, three L-yes and six L-no, any correlation coefficient
must be very large (0.99 for n = 3, 0.81 for n = 6) in order to be statistically significant. It must
also be borne in mind that such significance merely means that the correlation coefficient is
non-zero, but that its true value can lie within wide limits. 
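
The thresholds quoted above follow from the usual t test for a correlation coefficient, in which the smallest significant |r| is t / sqrt(t² + d.f.) with d.f. = n - 2. The sketch below is illustrative only and is not part of the original analysis: the function name is hypothetical, and the critical t values are assumed from standard two-tailed 5 per cent tables. It gives about 0.997 for three items and about 0.811 for six.

```python
import math

# Two-tailed 5 per cent critical values of Student's t, keyed by degrees of
# freedom. These are standard table values (an assumption of this sketch);
# only the d.f. needed for the item groups discussed here are listed.
T_CRIT_05 = {1: 12.706, 4: 2.776, 7: 2.365, 13: 2.160, 22: 2.074}


def critical_r(n_items):
    """Smallest |r| significant at P < 0.05 (two-tailed) for n_items pairs."""
    df = n_items - 2
    t = T_CRIT_05[df]
    return t / math.sqrt(t * t + df)


# Item groups and their sizes as given in Table 2.
for label, n in [("L-yes", 3), ("L-no", 6), ("E-no", 9),
                 ("E-yes", 15), ("N-yes", 24)]:
    print(f"{label:6s} n = {n:2d}   critical |r| = {critical_r(n):.3f}")
```

With so few L items, only near-perfect correlations can pass the threshold, which is why the significant entries for these items all lie at or above 0.8 in absolute value.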

Bearing this caution in mind, it can be seen from Table 6 that the susceptibility of the L-yes 
items to IE simulation shows a significant positive correlation with their E validity and a 
significant negative correlation with their L validity. This seems eminently plausible. In addition 
the L-yes items have a significant positive correlation between their NS simulation change and 
the non-simulating yes response bias. Presumably, the smaller the number of non-simulators who 
would say yes to an L-yes item, the more extreme the behaviour being inquired about, and the
less likely will a subject be to consider the item to be a measure of anything other than moral
probity. As would be expected from the correlations between IE simulation change and the E and
L validities of these items, the E and L validities are negatively correlated.


Table 7. Intercorrelations of the L-no items

         IE        NS        Y         DET       tE        tN        tL
IE       -         0.16     -0.12     -0.82*     0.42     -0.38      0.15
NS       -         -         0.49     -0.26     -0.15     -0.38     -0.90*
Y        -         -         -         0.40     -0.80      0.25     -0.33
DET      -         -         -         -        -0.75      0.47      0.08
tE       -         -         -         -         -        -0.60      0.20
tN       -         -         -         -         -         -         0.16

The data in Table 7 show that for the L-no items there is a negative correlation between the 
IE simulation change of the items and their detectability and also a negative correlation between 
the NS simulation change and the validity of the items on the L scale. Both of these correlations 
show that the more obviously the item is an L item or the higher its validity, the less is it 
susceptible to simulation on IE or NS. 


Conclusions 


Firstly, we have again shown how successfully subjects can simulate a variety of personalities. 
In addition we have demonstrated that the items in the EPI scales shift in response to simulation 
in the expected manner, in that N scale items change more with N and S simulation, and E scale 
items change more with I and E simulation. The E scale items appear slightly more susceptible 
to simulation. 

In some cases, the susceptibility of items to simulation is apparently straightforward, in that 
the IE shifts in the N scale items are significantly correlated with their E factor loadings and E 
validity, for the E-no items the IE change is correlated with the E factor loading, and for the 
L-yes items the IE change is correlated with their E validity. No such simple associations are 
seen for the NS simulation changes, however. 

In two instances, the simulation change is correlated with the loadings on the other factor. 
Thus, for the E-no items, the IE change is positively correlated with the N loadings, and for the 
N items the NS change is negatively correlated with the E loadings. These results probably 
represent the negative correlation between the E and N loadings of the E-no items, and also the 
belief in such a negative correlation held by subjects. 

In two instances, the non-simulating percentage of yes responses to items is correlated with 
their susceptibility to NS simulation change. The smaller the correlated percentage of yes 
responses, the more extreme the item. With the N items, this makes them more clearly N and 
more susceptible to NS simulation, while with the L-yes items it presumably makes the items
more clearly L, and therefore less liable to NS simulation.

Finally, the L items appear to be less susceptible to IE and NS simulation when they are more
detectable as L items (L-no) or are more valid as L items (L-yes and L-no).


Effects of phase-of-respiration on GSR detection


George M. Diekhoff 





Previous studies of visual and auditory signal detection (Flexman & Demaree, 1972; Flexman, 1974; 
Flexman, Demaree & Simpson, 1974) have found that signals occurring during exhalation are detected with 
greater sensitivity than are signals occurring during inhalation. The purpose of the present study was to make 
use of a signal detection paradigm (Green & Swets, 1966) in extending these phase-of-respiration findings to 
internal event detection, specifically, to the detection of spontaneous galvanic skin responses (GSRs). A 
secondary purpose of the study was to re-examine Stern’s (1972) finding that high magnitude GSRs were 
better detected than were low magnitude GSRs. It was found that the phase of respiration during which the 
GSR reached its peak significantly influenced GSR detection, but that unlike studies of external signal 
detection, detection was greater for GSRs peaking during inhalation than for GSRs peaking during 
exhalation. Some possible artifactual sources of this finding are discussed. No significant effect of GSR 
magnitude on GSR detection was observed although differences were in the expected direction. 


Previous studies of visual and auditory signal detection (Flexman & Demaree, 1972; Flexman, 
1974; Flexman, Demaree & Simpson, 1974) have found that signals occurring during exhalation 
are detected with significantly greater sensitivity than are signals occurring during inhalation. It 
has been suggested that this is due to increased neural noise accompanying the active phase of 
respiration. 

One purpose of the present study was to extend these phase-of-respiration findings to a 
detection task involving an internal event, the spontaneous galvanic skin response (GSR). It was 
hypothesized that if increased neural noise during inhalation exerted a detrimental effect on 
detection of external signals, the effect would be particularly strong in an internal event 
detection task and that GSRs peaking during exhalation would be better detected than would 
GSRs peaking during inhalation. A second purpose of the study was to reexamine Stern’s (1972) 
finding that subjects showed greater detection of high magnitude GSRs than of low magnitude 
GSRs. 


Method 
Subjects 


Sixteen students were drawn from introductory psychology classes at Texas Christian University to serve as 
subjects in the present experiment. 


Apparatus 


All experimental sessions were conducted with subjects seated in a sound-attenuated room facing a feedback 
display panel. Physiological measures were recorded using a Narco Bio-Systems Physiograph Six. Skin 
resistance was recorded using a Narco Bio-Systems GSR preamplifier with AC coupling and a time constant
of 5 sec. Monopolar Ag/AgCl electrode placement was used to record skin resistance from the left arm, with
the active electrode located on the middle segment of the middle finger and the reference electrode located
approximately 8 cm above the wrist. Electrode current density was maintained at 10 µA/cm². Respiration was
recorded using bellows pneumography and impedance pneumography, recorded from Ag/AgCl electrodes 
attached to the chest under each arm. 

A respond signal (RS) circuit was manually activated by the experimenter causing a pilot lamp on the 
feedback display panel to light 1.5 sec following activation of the circuit. Events were marked on the
respiration records both at the time of RS circuit activation and at the time of RS onset. 

Four response buttons were located on the right arm of the subjects' chair, each representing a degree of 
certainty as to whether or not a GSR had occurred. These certainties ranged from ‘certain no GSR occurred’ 
to ‘certain a GSR did occur’.

Feedback was manually presented by the experimenter. Pilot lamps located on the feedback display panel
were used to indicate whether or not a GSR had occurred and the relative magnitude of those GSRs that did 
occur (‘low’ magnitude = 100-500 ohms; ‘high’ magnitude = over 500 ohms). Subjects’ responses and GSR 
magnitudes were recorded by the experimenter. 


Procedure 


Each subject served in three, one-hour sessions over a course of three weeks, during which time as many 
trials as possible (up to 110 per session) were obtained from each subject. Approximately two-thirds of the 
trials during each session were GSR trials, with at least 10 high-magnitude and 20 low-magnitude GSRs, and 
the remaining trials were no-GSR trials. 

Subjects were instructed that occasionally the RS would light and that they should indicate, using one of 
the four response buttons available, their relative certainty as to whether or not a GSR had occurred. 
Subjects were informed that on approximately two-thirds of the trials, the RS circuit would be activated as 
soon as the GSR reached its peak and that on the remaining trials, the RS circuit would be activated during a 
period of no-GSR activity. On these no-GSR trials, the RS circuit was activated following at least 8 sec 
during which no GSR-like decreases in skin resistance were observed. The delayed-RS procedure avoided 
confounding effects of the phase of respiration during which the GSR peaked with the phase of respiration 
during which the RS occurred. 

Although the complete sequence of trial types (GSR or no-GSR) was not randomly prearranged, an effort 
was made to avoid any patterning of trial types, and subjects were informed that trial types were randomly 
arranged. In addition, a post-experimental questionnaire revealed little conscious use of patterning of trial 
types as a response mediator, and GSR detection scores of subjects who indicated having used this cue were 
not noticeably higher than scores obtained by other subjects. 

Subjects were instructed to attend to unspecified physical sensations in making their responses. They were 
not told of the variety of GSR mediators, e.g. emotional thoughts, respiratory irregularities, external stimuli, 
etc. It should be noted that responses to the post-experimental questionnaire revealed little conscious use of 
mental events as a response mediator. In addition, the extent to which muscular activity (appearing as 
artifact on the respiration and skin resistance records) and alterations in respiratory activity directly influenced 
scores was minimal since the experimenter avoided initiating trials during which these cues would have been 
available during the experiment, and since such trials were discarded during the process of data scoring. 
Finally, no significant relationship was found between inter-trial interval (ITI) and subjects’ responses,
indicating that ITI was probably not used in mediating responding to a significant extent. 

Feedback was delivered immediately following each response. Two types of feedback were used in the 
present study: magnitude feedback, in which accurate information concerning the presence or absence and 
the relative magnitude of GSRs was provided; and, correctness feedback, in which subjects were told only 
whether or not a GSR had just occurred. As no differences in GSR detection performance were found 
between these two groups, their data were combined. In addition, as neither group showed significant 
changes in GSR detection as a function of sessions, all results reported here will be collapsed on this factor 
(Diekhoff, 1976). 


Dependent measures 


Four receiver operating characteristic (ROC) curves were formed for each subject according to the rating 
method (Green & Swets, 1966). These ROC curves were based on performance during all three sessions and 
enabled evaluation of subjects' ability to detect low- and high-magnitude GSRs peaking during inhalation and 
exhalation. Exhalation was defined as the portion of the respiratory cycle from the peak of inhalation of one 
cycle to the beginning of inhalation of the next cycle, and thus included the pause between exhalation and 
inhalation. In forming ROC curves, GSRs were assigned to the phase of respiration during which they 
reached their peaks. No-GSR trials were assigned to the phase of respiration during which the RS circuit was 
activated. An approximately equal number of no-GSR trials were used as blank trials for evaluation of 
detection of low- and high-magnitude GSRs. 

The dependent measure of GSR detection consisted of the absolute difference between the areas above
and below each ROC curve and reflected a subject's ability to differentiate between GSR and no-GSR trials.
The particular labelling strategy used by a subject, accurate or reversed, did not affect this detection score 
since the consistent use of either strategy indicates an ability to detect GSRs. Scores on the GSR detection 
measure could vary between 0 and 100, with scores increasing as a function of increasing sensitivity of GSR 
detection. All scores reported below refer to sensitivity of GSR detection. 
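
As an illustration of how such a score can be computed from rating data, the following sketch builds the ROC points by cumulating the four certainty categories and takes the area under the curve by the trapezoidal rule. The trapezoidal rule, the function names and the response counts are all assumptions made for the example; the source does not state how the areas were obtained.

```python
def roc_points(gsr_counts, blank_counts):
    """Return ROC points (false-alarm rate, hit rate) from rating counts.

    Both lists run from 'certain a GSR did occur' to
    'certain no GSR occurred'; each entry is a number of trials.
    """
    n_gsr, n_blank = sum(gsr_counts), sum(blank_counts)
    points = [(0.0, 0.0)]
    hits = false_alarms = 0
    for g, b in zip(gsr_counts, blank_counts):
        hits += g
        false_alarms += b
        points.append((false_alarms / n_blank, hits / n_gsr))
    return points  # the final point is always (1.0, 1.0)


def detection_score(gsr_counts, blank_counts):
    """Absolute difference between areas below and above the ROC, 0 to 100."""
    pts = roc_points(gsr_counts, blank_counts)
    # Trapezoidal area under the piecewise-linear ROC curve.
    area_below = sum((x1 - x0) * (y0 + y1) / 2.0
                     for (x0, y0), (x1, y1) in zip(pts, pts[1:]))
    area_above = 1.0 - area_below
    return 100.0 * abs(area_below - area_above)


# Invented counts for one subject and one condition (40 GSR, 40 no-GSR trials).
accurate = detection_score([20, 10, 6, 4], [4, 6, 10, 20])
reversed_labels = detection_score([4, 6, 10, 20], [20, 10, 6, 4])
chance = detection_score([10, 10, 10, 10], [10, 10, 10, 10])
print(f"accurate: {accurate:.1f}  reversed: {reversed_labels:.1f}  chance: {chance:.1f}")
```

Because the score is an absolute difference, a subject who consistently reverses the labels obtains the same value as one who labels accurately, while responding unrelated to the GSRs gives a score near zero.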


Results


The effects of GSR magnitude and phase-of-respiration on GSR detection were evaluated using a 
two-factor randomized block Anova. No significant difference was found between the detection 
of low (M = 23.4) and high magnitude GSRs (M = 24.2; F = 0.069, d.f. = 1, 45), nor was the
interaction between GSR magnitude and phase-of-respiration found to be significant (F = 0.002,
d.f. = 1, 45). Phase-of-respiration was found, however, to have a significant main effect on GSR
detection (F = 4.986, d.f. = 1, 45, P < 0.05). GSRs peaking during inhalation were detected with
significantly greater sensitivity (M = 27.2) than were GSRs peaking during exhalation (M = 20.4).
Failure of GSR magnitude to influence GSR detection scores to a significant extent ensures that
the phase-of-respiration main effect was not due to a confounding of phase-of-respiration and
GSR magnitude, i.e. the phase-of-respiration effect was not an artifact of increased GSR
magnitude during inhalation and decreased GSR magnitude during exhalation. Failure of the
phase-of-respiration × GSR magnitude interaction to reach significance indicates that the
phase-of-respiration effect on GSR detection occurred for both low and high magnitude GSRs. 
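
The degrees of freedom reported above are those expected when the 16 subjects are treated as blocks and the four magnitude-by-phase combinations as treatments. The following sketch is only a bookkeeping aid under that assumption, not the original analysis, and the function name is hypothetical.

```python
def randomized_block_df(n_blocks, levels_a, levels_b):
    """Partition the degrees of freedom of a two-factor randomized block design."""
    treatments = levels_a * levels_b
    return {
        "blocks": n_blocks - 1,
        "A": levels_a - 1,
        "B": levels_b - 1,
        "A x B": (levels_a - 1) * (levels_b - 1),
        # Residual: block-by-treatment interaction pooled as the error term.
        "error": (n_blocks - 1) * (treatments - 1),
        "total": n_blocks * treatments - 1,
    }


# 16 subjects; GSR magnitude (low, high) by phase of respiration (in, out).
df = randomized_block_df(n_blocks=16, levels_a=2, levels_b=2)
for source, value in df.items():
    print(f"{source:7s} {value}")
```

Each effect therefore has 1 degree of freedom and the error term has 45, matching the d.f. = 1, 45 quoted for every F ratio.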

It might be argued that the phase of respiration during which the RS occurred, rather than the 
phase of respiration during which the GSR peaked, influenced GSR detection performance. In 
order to evaluate this possible artifact, two ROC curves were formed for each subject reflecting 
performance on trials during which the RS occurred during inhalation and trials during which the 
RS occurred during exhalation. A t test for related measures was used to evaluate any difference 
in detection performance between inhalation-RS and exhalation-RS trials. This test did not 
approach significance, indicating that the phase of respiration during which the RS occurred was 
not the source of the significant effect of phase-of-respiration on GSR detection. 

Discussion 

It was found in the present study that the phase of respiration during which spontaneous GSRs 
reached their peaks had a significant effect on the forced-choice detection of those GSRs. 
Several possible artifactual sources of this effect were examined and eliminated. Unlike previous 
studies which examined the effects of phase-of-respiration on detection of external signals and 
found better detection during exhalation than during inhalation, the present study found that 
GSRs peaking during inhalation were better detected than were GSRs peaking during exhalation. 

Further studies of the effect of phase-of-respiration on GSR detection should examine several 
potential sources of the discrepancy between the present study and previous phase-of-respiration 
studies. One such possibility is that near-threshold external stimuli occurring during inhalation 
frequently produce ‘spontaneous’ GSRs peaking during exhalation and that the external stimuli 
occurring during exhalation frequently produce GSRs peaking during inhalation. If this were 
found to be true, it could be argued that the subjects in the present study were basing their 
responses on the occurrence of external stimuli which produced GSRs as a component of the 
orienting response rather than on the occurrence of the GSRs themselves. External stimuli 
occurring during inhalation would be expected to be less readily detected than would external 
stimuli occurring during exhalation. In the present study, this situation would have resulted in 
the apparent superior detection of GSRs peaking during inhalation and inferior detection of 
GSRs peaking during exhalation. 

A second alternative interpretation that might be explored is that a portion of the GSR other 
than the peak is used as the ‘signal’ in the detection of GSRs and that this portion frequently 
occurs during inhalation when the peak of the GSR occurs during exhalation, and occurs during 
exhalation when the peak of the GSR occurs during inhalation. 

Finally, it may be found that whereas external stimuli are better detected during exhalation, 
internal stimuli are better detected during inhalation. Such a conclusion would necessitate 
revision of the neural noise explanation of phase-of-respiration effects on signal detection.

In conclusion, while discrepant with previous phase-of-respiration findings, the results of the
present study again point to phase-of-respiration as an important source of intrasubject
variability in signal detection performance.


Cross-modality transfer of spatial information


Harold D. Fishbein, Joanne Decker and Peggy Wilcox





First, second, and fourth grade boys and girls were tested on a spatial task requiring perception of the 
location of a group of three geometrical objects. The initial sensory input was either visual or 
tactual/kinaesthetic, and the choice stimuli, which were presented either simultaneously with the objects, or 
after the objects were removed (successively) were photographs of different configurations of the objects. 
There was no performance difference between the intramodal and cross-modal conditions, although older
children performed better than younger ones, and performance was better under the simultaneous than
successive conditions. It was concluded that in making complex visual spatial judgements, visual perceptual
representations mediate performance under both tactual/kinaesthetic and visual sensory inputs.





The present paper addresses itself to answering three related questions concerning visual-visual 
intramodal matching and tactual/kinaesthetic (TK) - visual cross-modal matching of complex 
spatial information. By ‘complex’ we mean information about spatial relationships between 
objects in the external environment. The first question is, how adequately can complex spatial 
information received through the TK senses be transformed into a visual format and stored 
therein? Recent research by Attneave & Benson (1969), Connolly & Jones (1970) and Pick (1970) 
involving simple spatial information strongly suggests that TK information is readily transformed 
and stored in a visual format, e.g. in the Connolly & Jones study visual-visual matching and 
kinaesthetic-visual matching were nearly equivalent. The first question, then, is concerned with 
the generality of their findings. 

The second question is, does the transformation and storage of TK sensory information into a 
visual format occur concurrently with (or prior to) TK storage or does it occur after TK storage? 
If the transformation and storage into a visual format occurs after TK storage, then the typical 
TK memory decrement relative to visual memory following experimenter-imposed response 
delays (e.g. simultaneous versus successive presentation) should be observed (Posner, 1967; 
Goodnow, 1971; Friedes, 1974). However, if, as Connolly & Jones (1970) found, the
transformation occurs prior to or concurrently with TK storage, then the effects of simultaneous
versus successive stimulus presentations should be equivalent under visual-visual and TK-visual 
conditions. 

The third question has two parts — do either the adequacy of or the time parameters of the 
transformation and storage of TK spatial information into a visual format vary with 
developmental level? Regarding the first part, many studies dealing with comparisons of 
intramodal and cross-modal matching of shape information have found increased cross-modal 
matching with increasing age, e.g. Goodnow (1971), Jackson (1973) and Jones & Robinson 
(1973). However, in their simple spatial task, Connolly & Jones found no evidence of increased 
kinaesthetic-visual matching relative to visual-visual matching in the age range five years to 
adulthood. This latter finding implies that by five years of age, for spatial tasks, children can 
readily utilize a visual format for kinaesthetic information. Do young children have this capacity 
for complex spatial information? 

Regarding the time parameters of the transformation, do younger relative to older children 
transform TK information into a visual format upon being presented the visual comparison 
stimuli or do they perform the transformation concurrently with receiving TK information? If 
the former is the case, then younger children should show a relatively greater performance 
decrement than older children under the ‘successive’ TK-visual condition. However, if the latter
is the case, then relative performance decrements in that condition should be equivalent for
younger and older children. Research bearing on this issue is conflicting. Jones & Robinson
(1973) for shape perception, and Smothergill, Hughes, Timmons & Hutko (1975) for spatial 
location found no systematic age by response delay effects, but Smothergill (1973) for a different 
spatial location task did find relatively greater performance decrements for younger children with 
increasing response delay intervals. 

Answers to these questions were sought by testing children of different ages on a complex 
space perception task which has been found to be sensitive to the age differences employed in 
the present study (Nigl & Fishbein, 1974). The Nigl & Fishbein task was extended by varying 
initial stimulus-input (visual versus TK) and by varying time of presentation of the comparison 
stimuli (simultaneous versus successive). 

Finally, it should be noted that the present research has no direct bearing on the question as to 
whether the visual mode plays the dominant role for sighted people in the development of their
spatial behaviour. The issue of the ‘primacy of the visual system’ in spatial abilities has recently 
been discussed by Jones (1975), who concludes that the research data do not support such a 
view. 


Method 
Subjects and design 


The subjects were 48 white boys and 48 white girls from a middle-class Cincinnati, Ohio public school. They 
were divided into three groups of 16 boys and 16 girls on the basis of age: first grade (mean age of 6 years, 6 
months), second grade (mean age of 7 years, 7 months), and fourth grade (mean age of 9 years, 8 months). 
The children were tested in a 3 x 2 x 2 x 2 factorial design. The between-subjects factors were: age and
method of object presentation (visually or tactually/kinaesthetically). The within-subjects factors were: type
of photographs used (top-down perspective of the objects vs. frontal or straight-ahead perspectives); and
method of presenting the photographs (simultaneous with the presentation of the object vs. an approximately
5 sec delay after presenting the objects).


Apparatus 


The inspection stimuli consisted of three wooden objects painted gray: one 12" x 14" cube; one cylinder 136" 
long, with a radius of 4"; and one three-sided pyramid 14" high. These objects were displayed near the
centre of an 8" x 10" white peg-board, such that two of the objects were parallel to the horizontal sides of the
peg-board, and two were parallel to the vertical sides. 

The response stimuli consisted of 32 sets of four 8"x 10" black-and-white photographs of the geometric 
objects taken from either a top-down or frontal perspective. Each set contained one veridical photograph, 
one photograph showing the array rotated 180 degrees, one photograph showing a left-right reflexion of the
array, and one photograph showing a back-front reflexion of the array. The four photographs were randomly
mounted on cardboard to form a horizontal line. 

A cardboard box, 10" high, 12" wide, and 18" long, shielded the display from visual inspection for subjects
in the TK condition. An opening, 5" x 6", was cut into the front panel of the box to accommodate the child's
hand. Small openings were cut in the front and top panels to serve as windows for a small plastic doll to ‘see’ 
through. 

A plastic doll, 4" high, was used to direct the subject's attention to the proper perspective. 


Procedure 


Each child was taken by the white female experimenter to a testing room in his school, and seen 
individually. The child was seated at a table upon which the three geometric objects were placed. In order to 
familiarize the child with the materials, each child was asked to hold the objects and to provide a label for
each of them (correctness of label was not required). The child was also shown a photograph correctly 
depicting the inspection display. 

For children tested in the visual condition, the plastic doll was placed either directly in front of the
displayed objects (frontal perspective) or suspended on a wire directly above the objects (top-down
perspective). The children themselves viewed the array from an oblique angle. They were told: ‘I’m
going to put these blocks in different places on the board. Then I'm going to show you four pictures. What I
want you to do is point to the picture that looks like what the doll can see from where he is.’

For children tested in the TK condition, the plastic doll was placed either at the small opening in the front
panel of the cardboard box (frontal perspective) or at the opening in the top panel of the box (top-down
perspective). They were then given instructions identical to those given to the visual group with the addition of
the following sentence: ‘The way you’re going to figure out what the doll sees is by feeling the blocks with 
your hand.’ 

In the situation where the photographs were presented simultaneously with the stimulus array, the child 
made his choice from the set of four photographs while visually (or TK) inspecting the array. This task was 
subject-paced. 

In the situation where presentation of the photographs was delayed, the child was permitted to visually (or 
TK) inspect the stimulus array until he indicated he was ready to view the photographs and make a choice. 
The array was then removed, and the photographs were displayed approximately 5 sec later. 

Each child received 16 trials under one of four presentation orders. Each order involved the sequence of 
blocks of four trials, e.g. trials 1-4, top-down photographs, simultaneous presentation; trials 5-8, top-down 
photographs, delayed presentation; trials 9-12, frontal photographs, simultaneous presentation; trials 13-16, 
frontal photographs, delayed presentation. The order of presentation was counterbalanced across subjects. 
For each of the 16 trials, a different spatial arrangement of the geometric objects was used, with corrective 
feedback given after each trial. An experimental session lasted approximately 30 min. 
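
The factorial layout described in the Method can be checked with a short sketch. The variable names are illustrative, and the equal allocation of children to each age-by-input group is an assumption: the text gives only the totals per grade.

```python
from itertools import product

# Factor levels as described in the Method section.
ages = ["first grade", "second grade", "fourth grade"]   # between subjects
inputs = ["visual", "TK"]                                 # between subjects
photo_types = ["top-down", "frontal"]                     # within subjects
timings = ["simultaneous", "delayed"]                     # within subjects

between_cells = list(product(ages, inputs))
within_cells = list(product(photo_types, timings))
all_cells = list(product(ages, inputs, photo_types, timings))

n_children = 96  # 48 boys and 48 girls
# Assumption: children were split evenly over the age-by-input groups.
children_per_between_cell = n_children // len(between_cells)

trials_per_within_cell = 4  # each block consisted of four trials
trials_per_child = len(within_cells) * trials_per_within_cell

print(len(all_cells), children_per_between_cell, trials_per_child)
```

Under these assumptions the design has 24 cells, 16 children in each age-by-input group, and 16 trials per child, which agrees with the 16 trials reported in the Procedure.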


Results and discussion 


Table 1 presents the mean number of errors and standard deviations, out of four trials, for all
the major conditions of the experiment, averaged for boys and girls. As can be seen, there is a
steady decrease in errors with increasing age (F = 21.58, P < 0.01), and fewer errors were made
under the simultaneous conditions than under the delayed conditions (F = 24.31, P < 0.01). There


Table 1. Mean number of errors as a function of age, initial sensory input, type of photograph,
and simultaneous versus delay presentation of photographs; maximum score equals four

                                 Grade 1           Grade 2           Grade 4
Type of photograph             TK    Visual      TK    Visual      TK    Visual

Simultaneous presentation
  Top-down      Mean          1.19    1.56      0.81    1.19      0.50    0.44
                S.D.          1.22    1.44      0.70    1.56      0.63    0.73
  Frontal       Mean          1.75    1.19      0.75    0.94      0.38    0.31
                S.D.          1.44    0.98      0.96    1.00      0.50    0.60
Delay presentation
  Top-down      Mean          2.56    2.13      1.31    1.75      0.50    0.81
                S.D.          1.09    0.96      1.00    1.11      0.73    1.15
  Frontal       Mean          1.50    1.75      1.38    1.69      0.69    1.00
                S.D.          1.09    1.06      0.96    1.01      0.70    0.96





were no statistically significant performance differences as a function of sex of subject, type of
photograph employed, or method of initial sensory input, i.e. TK versus visual (all Ps > 0.10).
Regarding the latter comparison, children tested under the cross-modality transfer procedures
averaged 1.11 errors (out of four trials) and children tested under the intramodality transfer
conditions averaged 1.23 errors. None of the two-way interactions approached statistical
significance (all Ps > 0.10) and only one of the three-way interactions was statistically significant:
age by type of photograph by simultaneous versus delay presentation (F = 4.30, P < 0.05). This
interaction is primarily attributable to the relatively inferior performance of the first graders
under the top-down, delay conditions. 
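
The two overall averages quoted above can be recovered from the cell means in Table 1. The sketch below takes an unweighted mean of the twelve cell means for each input mode, which is appropriate only on the assumption that every cell contains the same number of children; the function and variable names are illustrative.

```python
# Cell means from Table 1 (errors out of four trials), listed as
# Grade 1, Grade 2, Grade 4 within each photograph/presentation condition.
tk_means = [
    1.19, 0.81, 0.50,   # simultaneous, top-down
    1.75, 0.75, 0.38,   # simultaneous, frontal
    2.56, 1.31, 0.50,   # delay, top-down
    1.50, 1.38, 0.69,   # delay, frontal
]
visual_means = [
    1.56, 1.19, 0.44,   # simultaneous, top-down
    1.19, 0.94, 0.31,   # simultaneous, frontal
    2.13, 1.75, 0.81,   # delay, top-down
    1.75, 1.69, 1.00,   # delay, frontal
]


def grand_mean(cell_means):
    """Unweighted mean of cell means (assumes equal cell sizes)."""
    return sum(cell_means) / len(cell_means)


tk_overall = round(grand_mean(tk_means), 2)          # cross-modal (TK input)
visual_overall = round(grand_mean(visual_means), 2)  # intramodal (visual input)
print(tk_overall, visual_overall)
```

The rounded results, 1.11 for TK input and 1.23 for visual input, match the figures reported in the text.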

Table 2 presents the total number of errors made by each age group, under each
experimental condition, to each of the non-veridical photographs. The data in this table answer
the question, ‘when children make an error, what kind is it?’ For example, when second grade
children were tested under the TK top-down delay condition and made an error, they pointed to


Table 2. Total number of errors to each non-veridical photograph as a function of age, initial
sensory input, type of photograph and simultaneous versus delay presentation of photographs

                      Top-down photographs                Frontal photographs
                  Simultaneous       Delay            Simultaneous       Delay
                  LR   BF  180°   LR   BF  180°       LR   BF  180°   LR   BF  180°

First grade
  TK               8    8    3    15   17    9        14   11    3     9   12    3
  Visual          13    4    8    15    7   12         8    5    6    11   10    7
Second grade
  TK               3    8    2     4   12    5         3    8    1     2   15    4
  Visual           5   11    3     8   12    8         5   10    0    11   13    3
Fourth grade
  TK               5    1    2     2    3    3         0    4    2     2    7    2
  Visual           5    1    1     7    5    2         2    2    1     1   13    2

the back-front (BF) photograph 12 times, the 180° photograph five times, and the left-right (LR)
photograph four times. These data provide information regarding whether the processes
underlying visual spatial recognition are the same or different under TK and visual sensory
inputs. Using several Bernoulli trials analyses and excluding cases in which ties occurred, the
following conclusions can be drawn. First, under neither the TK nor visual conditions was the
180° photograph the most frequent error (P < 0.01). Second, under each of the 12 conditions
depicted, e.g. (1) first grade, top-down, simultaneous; (2) first grade, top-down, delayed, etc., the
number of times the most frequent error was the same for the TK and visual conditions
exceeded chance expectations (P < 0.01). And finally, under the TK conditions, the most
frequent error was the back-front photograph (P < 0.01), whereas under the visual conditions,
the left-right error tended to be the most frequent (P < 0.05).
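
The text does not spell out how the Bernoulli trials analyses were set up, so the following is only a sketch of one plausible version, applied to the TK rows of Table 2. It assumes that each of the 12 conditions counts as one trial, that tied conditions are dropped, and that under chance each of the three non-veridical photographs is equally likely (probability 1/3) to be the most frequent error. The helper names are illustrative.

```python
from math import comb


def binomial_upper_tail(successes, trials, p):
    """P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# TK rows of Table 2: (LR, BF, 180-degree) error totals for the four
# photograph conditions at each grade.
tk_errors = [
    (8, 8, 3), (15, 17, 9), (14, 11, 3), (9, 12, 3),   # first grade
    (3, 8, 2), (4, 12, 5), (3, 8, 1), (2, 15, 4),      # second grade
    (5, 1, 2), (2, 3, 3), (0, 4, 2), (2, 7, 2),        # fourth grade
]
labels = ("LR", "BF", "180")


def most_frequent(cell):
    """Label of the most frequent error, or None when the top count is tied."""
    top = max(cell)
    if cell.count(top) > 1:
        return None
    return labels[cell.index(top)]


# Keep only conditions with a unique most frequent error.
winners = [w for w in map(most_frequent, tk_errors) if w is not None]
bf_wins = winners.count("BF")

# Assumed chance level: each error type is the winner with probability 1/3.
p_value = binomial_upper_tail(bf_wins, len(winners), 1 / 3)
print(bf_wins, len(winners), p_value < 0.01)
```

On these assumptions the back-front photograph is the most frequent error in 8 of the 10 untied TK conditions, and the upper-tail probability falls below 0.01, in line with the reported result. The same procedure need not reproduce every reported probability, since the original analysis may have differed in detail.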

How do these results bear on the three questions previously raised? First, consistent with the 
research of Attneave & Benson (1969), Connolly & Jones (1970), and Pick (1970), for complex 
spatial information TK stimulation can be readily transformed and stored in a visual format. 
Moreover, these results indicate that under both visual and TK inputs, the visual format 
mediated performance. This was seen in the nearly identical overall correct performance under 
intramodal and cross-modal conditions, in the lack of a significant interaction between mode of 
initial input and simultaneous versus delay presentation on performance, and in the highly similar 
types of errors made under the intramodal and cross-modal conditions. The one apparent 
exception to their equivalence is that under TK sensory inputs, children were more likely to 
confuse ‘front’ and ‘back’ than ‘right’ and ‘left’, whereas the opposite tended to be the case 
under visual sensory inputs. However, this ‘exception’ may be attributable to differences in the 
extraction of sensory information and not to differences in mediation of performance, a point 
which Goodnow (1971) has strongly emphasized. A further test of the latter hypothesis could be
made by varying initial sensory input, but using the TK modalities for the comparison stimuli. A
bonus with this procedure would be a determination of how well the TK system can mediate
complex spatial behaviour. 

Second, consistent with the Jones & Connolly (1970) results with college students for their 
simple spatial task, the present data support the view that the transformation and storage of TK 
information to a visual format occurs prior to or concurrent with TK storage. This conclusion is 
based on the findings that under both intramodal and cross-modal conditions performance was 
better under the simultaneous than successive conditions, and moreover, that there was no
interaction between these sets of conditions on performance. Given the above conclusion, the 
question can be raised as to why in shape perception tasks (as summarized by Goodnow, 1971 
and Friedes, 1974) subjects apparently do not transform TK information into a visual format 
during TK encoding? That is, in some of these studies (e.g. Rose, Blank & Bridger, 1972) 
TK-visual matching is equivalent to visual-visual matching under simultaneous conditions, but 
inferior under successive conditions, implying that the transformation of TK information to a 
visual format occurs when the visual comparison stimuli are presented. As we have just seen, 
this is a different pattern of results from that found in spatial tasks. 

There are at least two possible explanations for this dilemma. First, as Friedes (1974) argues, 
in spatial tasks the visual format is probably the most ‘adept’ modality for processing and 
storing information, whereas, this may not be the case for shape perception. Hence, in shape 
perception tasks, subjects may need the assistance of visual cues, as found in the visual 
comparison stimuli to transform TK shape information into a visual format. In the successive 
conditions, then, the TK information is initially stored in a TK format where it decays until the 
visual stimuli are presented. The second explanation is based on the ‘cue-dependent forgetting’ 
hypothesis of Tulving (1974). The basic argument here is that in shape perception, TK 
information may be transformed into a visual format during the TK encoding process, but under 
‘successive’ TK-visual conditions too many of the cues of the acquisition environment are
absent for effective retrieval to occur. This implies, of course, that the problem is not one of TK 
memory, per se, but rather of differences in the retrieval environment in the intramodal and 
cross-modal conditions. There is no obvious basis for preferring one explanation over the other. 

Finally, the nearly equivalent performance, at each age level, under the intramodal and 
cross-modal conditions, strongly indicates that by 6 years of age, for complex spatial tasks the 
visual format is readily used to mediate performance. Moreover, the absence of an interaction 
between either age and simultaneous versus successive presentation, or initial sensory input and 
simultaneous versus successive presentation, strongly indicates that the time parameters involved
with the transformation and storage of TK information to a visual format are stable during the
developmental period of 6½ to 9½ years. This point is important to emphasize because other
spatial developmental factors are changing during this age range, as manifested by the marked 
performance improvement seen therein. What might these changes be? Nigl & Fishbein (1974) 
suggest that one such factor is the growing ability to extract relevant spatial information from 
photographic representations. The present study sheds no light on this. Another factor which 
changes is the ability to distinguish mirror images from veridical photographs. Examination of 
Table 2 indicates that second graders relative to first graders show a decrease in left-right errors, 
but not in back-front errors, and that fourth graders relative to second graders show a 
substantial decrease in back-front errors but only a minimal decrease in left-right errors. It is 
not clear why back-front errors should be more intransigent than left-right errors, but an 
attempted explanation would take us beyond the scope of this paper. 

In sum, it can be concluded that in making complex visual spatial judgements, visual
representations mediate performance under both TK and visual sensory inputs; that the
transformation and storage of TK information to a visual format occurs prior to or concurrent
with TK information storage; and that the processes underlying the transformation and storage
of TK information to a visual format are stable during the age range 6½ to 9½ years.


Book reviews


Prison Revolt. By Mike Fitzgerald. Harmondsworth: Penguin. 1977. Pp. 278. £1.00. 


The essential theme of this book is summed up in the closing paragraphs and in particular in the last 
sentence: ‘For the prison indeed oppresses not just those who are locked behind its bars; increasingly it 
reaches out to ensnare every one of us.’ This theme, with its implications of penal policy as an aspect of the 
class struggle and its continual disparagement of the work of middle-class liberal reformers, runs right 
through the chapters on the functions of imprisonment, on British penal policy, on men in prison (not, as 
might possibly be inferred from the title, an analysis of the composition of the prison population, but a 
selection of comments by prisoners on themselves and their reactions to the experience of imprisonment and 
comments on various aspects of prison regimes), and on the development of prisoners’ movements in the 
USA and Britain. 

On the evidence presented one would suspect the author himself sees the prospect of any significant short- 
or even medium-term improvement in prison conditions as being extremely bleak, for all his enthusiastic 
advocacy of the politicization of the processes of prisoners’ protest. The lack of any significant Trade Union 
or working-class commitment to the interests of prisoners; the absence of anything but a numerically small 
commitment to P.R.O.P. from outside (even on the part of ex-prisoners); the difficulty of ‘concretizing’
support within the prisons; the preoccupation with immediacies rather than longer term strategies; and the 
internal dissensions within the movement itself are variously cited as contributory factors so far as this 
country is concerned, but there is virtually no discussion as to the reasons for this. The chapter on the 
development of Prisoners’ Unions in America asserts that it ‘has many lessons for this country’, but again 
there is no detailed discussion of the basis for the assertion founded, as it would need to be, on some 
comparative study of differences in the composition of prison populations and perhaps of public attitudes, 
which are touched on but not elaborated. Among these ‘lessons’ one may note that, while formalized 
constitutional and legal safeguards do not necessarily lead to substantive change for a variety of reasons 
which ultimately relate to the fact that there is no substantial body of outside support concerning itself with 
their enforcement, there is no clear evidence forthcoming that the ‘more radical, more organized political 
actions against the coercive system which have emerged, and to which the legal battle is largely 
subordinated’ are producing significantly better substantive results. 

So, for all the radical rhetoric, what ultimately emerges in concrete terms as one reads the account of the 
current activities of the London end of P.R.O.P.; the conclusions, drawn from the Scandinavian experience, 
which is not however discussed in any detail, on the limitations on the scope for useful action within the 
prison if the inherent dangers to prisoners are to be avoided; and the emphasis on the need to focus on 
specific issues rather than on general goals? Basically a recognition of the need for long-term development, 
through the medium of a prison-experience base, of a wider understanding and consequent concern at 
working-class level for the interests of prisoners. This would seem to parallel and complement rather than to 
supplant what the author would regard as the misguided efforts of ‘liberal reformers’ who know from long 
experience just how difficult the task is at all levels of society. If the author is not already aware of it, as 
one suspects he is, experience of the working-class reaction in Hull to the Hull Prison riot with its support 
for the prison officers as reflected in the local press and its antagonism to the prisoners, might have helped to 
convince him. Here indeed one might say with Vergil 'facilis descensus Averno, sed revocare gradum/ 
superasque evadere ad auras, hic labor, hoc opus est’. 

If the riot brought out the worst of the almost paranoid wish of the Home Office for concealment, on 
which Fitzgerald comments in other contexts, the descent to chaos (stimulated perhaps by the radical 
pressures from without?) and the backlash to it have certainly underlined the problems of establishing the 
greater identification, which he seeks, of a very heterogeneous prison population with an outside community 
which takes a jaundiced view of prisoners' claims to 'rights' and gives a low priority to expenditure on them.
‘If more money is behind this crap’, says one of his prisoners of a prison officers’ go slow, ‘what do you 
think of people who use a helpless section of this community as purchasing power?' Exactly perhaps what 
you think of people who picket hospitals to prevent the supplies getting through. There's a moral here 
somewhere, and perhaps it's that the sheer size of the task of wresting improvements out of a public, which 
is as unwilling as Fitzgerald suggests its capitalist masters are, leaves little room to waste time in dispute on 
theoretical argument. 

R. W. DRINKWATER 


Scientist as Subject: The Psychological Imperative. By M. J. Mahoney. Cambridge, Mass.: 1976. Pp. 249.
£8.80. 


The dust is now, perhaps, beginning to settle and the import of the newer philosophies of science for 
psychology becoming more clear; science begins to look more psychological every day. 

In this book Mahoney argues for the scientific investigation of what he claims is an as yet almost unstudied 
phenomenon; the psychology of the scientist himself. Psychologists must complete the circle and investigate 
the investigators, produce a psychology of psychology, for, he argues, scientists may currently be 
completely misled by an inaccurate image of themselves. To the extent that they see themselves as rational
and unbiased seekers after ‘the truth’ when they are not and cannot be, they will be inappropriately 
complacent, overly confident, and tenaciously committed to the confirmation of their theories and hypotheses 
quite in vain. A more realistic view is required, for, to quote what is clearly one of Mahoney’s favourite 
maxims, ‘He who is unaware of his ignorance will be misled by his own knowledge’ (Whately). It is, following 
Popper and W. W. Bartley, a comprehensively critical rationalism (CCR) that Mahoney proposes; scientists 
qua scientists should hold all their beliefs open to criticism rather than continually seeking to justify them — a 
very demanding philosophy indeed. 

Mahoney begins his book by describing what Robert Merton has called the ‘storybook image’ of scientists: 
the view that they are intelligent, always rational, logical, experimental, objective, humble, communal, and 
cautious. But then, in the course of discussing the everyday realities of graduate admission and training 
procedures (in the States), academic politics, journal publication policies, personal feuds between scientists 
(as frequent in history as now), and honesty in data gathering and reporting (finding like St James-Roberts in 
New Scientist, Nov. 1976, that data are not infrequently ‘groomed’ or ‘massaged’ into fitness for their task), 
he explodes that fairy tale dream entirely. For as far as he can tell, scientists are not exceptionally 
intelligent, they are often irrational and dogmatic, they have no great grasp of logic (for besides ‘affirming
the consequent’ all over the place, they have little understanding of the logic of falsification, as Wason, to
whom he refers, has repeatedly shown), they are more argumentative than experimental, are incautious,
arrogant, secretive, competitive, and often emotional; in short, they are human and thus, most importantly, 
fallible — their accounts are always, always open to criticism. 

Mahoney does not say, however, whether such a fine list of human failings gains the scientist a 
super-attractive spouse or not - to reverse completely the image present in the minds of Liam Hudson's 
schoolboys. The wonder is, in the face of all these defects, that science works at all. A fact which says
something about how the intention embodied in a social institution has little to do with the personal motives 
of the individuals functioning within it. Like mediaeval alchemists or theologians, they may be passionately 
interested only in achieving priority and having their findings publicly registered (e.g. see J. D. Watson The 
Double Helix), but as long as the rules of the science ‘game’ are from time to time observed, scientists’ own 
reasons for their actions are irrelevant — science as a social instrument works! But, Mahoney argues, its 
efficiency may be impaired; there are costs — cognitive, behavioural, and emotional ones. We may not realize 
that all our systems of scientific belief oversimplify and distort reality; we may thus act inappropriately on 
the basis of a blind belief in their accuracy; and we may be led to premature closure and an intolerance of 
other belief systems, while still remaining insecure ourselves. 

Thus the scientist's image of himself as a Paragon of Inquiry may be detrimental if it leads him (and 
others) to attach a certainty to his ‘knowledge’ that it just does not have. Science, Mahoney argues (following 
Kuhn, Popper and Bartley), is not to do with accumulating neutral facts that will, one day, produce a 
succinct explanatory theory, but with the never-ending revision of beliefs. The aim cannot be to produce 
guaranteed truth, but simply to reduce error. While some beliefs may be true even if not scientific, or untrue 
even if scientific, scientific beliefs are ones which are always open to revision, science having nothing to do 
with those beliefs intrinsically closed to such a process. Thus, Mahoney claims, one's rationality should be 
based upon criticism rather than justification; and he outlines (p. 146) the contrasts between the conventional 
(justificational) and the more recent (non-justificational) views of science. Most difficult to deal with in the 
non-justificational perspective is the claim that there is no ultimate epistemological authority to which to 
appeal for the truth of one's beliefs — sense data, logic, revelation, etc. — the authority of that authority is 
always a dogmatic irrational commitment of faith unless held open to question; hence the CCR of Bartley of 
which Mahoney approves. 

All this means that scientists are in the business of producing, not ‘the truth’ but persuasive versions of 
events (the features that such accounts should have if we are to allow them to persuade us, being a matter 
for continual critical psychological study). Thus in contrast to Popper and others who view scientific 
revelations as totally explicable by philosophical analyses, Mahoney claims, following W. B. Weimar's 
doctrine of psychological fallibilism, that the growth of science is a social psychological matter, and the
reasons why people accept or reject the various theories or paradigms offered them is a matter for 
psychologists to study, empirically (a task in Harré’s ethogenics or Garfinkel’s ethnomethodology) — but by 
what means, in terms of what beliefs? That is the rub. In this book Mahoney spends a great deal of time 
documenting the horrors of contemporary scientific life, but little on what is the task if his views are to 
prevail: the working through of all the implications for psychological practice within the perspective of the 
non-justificational metaphysics he proposes. 

JOHN SHOTTER 


Behaviourism and the Limits of Scientific Method. By B. D. Mackenzie. London: Routledge and Kegan Paul. 
1977. Pp. 193. £4.95. 


The author alleges that for 50 years psychologists, in the dress of Behaviourism of one kind or another, 
hoped to make their study a science by attempting to combine into one two separate rules, namely the 
elimination of consciousness as psychology’s subject matter and the rigorous use of scientific method. This 
book is an examination of this hope and these rules. The author admits that such an examination requires 
both a history of the origin and death of such a hope in psychology and a philosophical analysis of the 
necessity of its failure. Hence, his fear is that there may be too much philosophy for the psychologist and 
too much psychology for the philosopher. My own impression, however, is the opposite, namely that the 
psychology is too well known to the psychologist to interest him and the philosophy too elementary for the 
philosopher to interest him. 

In line with his general aim, the author, rightly, concentrates in his historical account on the methodology 
rather than the results of behaviourism. I do not, however, see the wisdom of identifying the behaviourist 
methodological approach as ‘positivism’ and its results and contents as ‘realist’. 

Though I am not an expert on the history of psychology, Mackenzie seems to me to make a plausible case 
for his thesis that the elimination of consciousness either from consideration or from existence sprang from 
the supposed inability to cope with this notion in animal behaviour research. Much contemporary 
philosophy, however - on which Mackenzie is strangely and wrongly silent - does make at least a plausible
attempt to understand much of human consciousness in a logically behaviourist way quite independently of 
any problem posed by animals, and to examine more carefully how far or how little ‘mental concepts’ can 
be dismissed or set aside as referring to unobservables. Nor am I at all convinced by his thesis that 
methodological rules cannot be satisfactorily applied to many of the kinds of problems that arise in scientific 
research. His reason for this thesis, namely that the solution of such problems requires faith, luck and
hunches, confuses the genetics of such solutions with their logic. Furthermore, his conclusion that
behaviourism’s positive contribution to psychology is a practical demonstration of its own untenability is far 
too pessimistic. 

Mackenzie’s book is, as far as I can judge, well documented and well argued as a slice of the history of 
psychology and rightly stresses that philosophically the mainstay of behaviourism was its insistence on strict 
procedures for deciding the truth or falsity of hypotheses, theories and conjectures, especially on the method 
of verification. Unfortunately, however, his knowledge of critical work done on verificationism seems to be 
second hand and takes over too unquestionably such objections as, for example, that the criterion of 
verifiability is itself unverifiable. Furthermore, he seems to suppose that philosophical analyses of concepts 
are or should be largely ‘reconstructions’, that is, aimed at examining the vices and virtues of the concepts 
out of which scientific theories are built. Hence, many of his supposed philosophical examinations are really
those which psychologists rightly make of rival psychologists’ theories. 

ALAN WHITE 


Psychologists on Psychology. By David Cohen. London: Routledge and Kegan Paul. 1977. Pp. 360. £6.95. 


As a survey of the present state and shortcomings of psychology, and an inquiry into what makes 
psychologists themselves tick, David Cohen’s book cannot be taken very seriously. It is based on interviews 
with a number of mostly well-known contemporary figures, to which Cohen adds his own comments and 
conclusions. However the sample of thirteen psychologists, or quasi-psychologists (for the sample includes 
Chomsky, Jouvet, Laing, Leupold-Löwenthal, a psychoanalyst, and Tinbergen) is small and
unrepresentative, and the questions put (apart from ‘Why did you become a psychologist?’, and ‘Were you 
reacting against a religious upbringing?’) were tailored to each individual. Any general conclusions derived 
are, therefore, necessarily insecure, and one wonders whether they ought to have been offered even with ‘all 
due reservations’. The conclusion which Cohen makes most of is particularly precarious. Psychologists
create psychology, he suggests, not when they are experimenting or observing, but when they are writing.
Psychology, in other words, is a sort of post hoc imaginative embroidery constructed subjectively from the 
meaningless dust of empirical data. Two psychologists, Liam Hudson and Henri Tajfel, lead him to this odd 
conclusion. But Tajfel admits that he has ‘never really been a scientist’, and Hudson thinks that the model 
of the physical sciences has been ‘disastrous’ for psychology. So how firm is a conclusion based on such a 
foundation? Are all psychologists of this description? From such a sample it is presumptuous indeed to go on 
to ‘offer a list of those problems which it seems to me must be resolved before there can be a general 
advance in psychology’. 

David Cohen is a journalist who specializes in writing on psychology, and it is as journalism that his book 
should be judged. As such it is good journalism, and interesting light reading for psychologists. Cohen has 
done his homework well, draws out his victims skilfully, and adds life with his descriptions. Although it 
doesn’t add anything from a scientific point of view to know, for example, that Donald Broadbent has ‘a 
round moon face’, that he can be very funny, positively twinkles when he makes a joke and ‘peppers his 
conversations with laughs’, it is nice to be reminded of these facts. It is fascinating to learn that Skinner’s 
first experiments were done with baby squirrels tugging at dangling peanuts in Harvard Yard. The 
psychologists and quasi-psychologists come to life in the interviews, they reveal their prejudices and their 
hobby-horses, and they often give themselves away, as Skinner does, for instance, when he makes the 
incredible statement that he knows of no physiological fact that throws any light on behaviour! 

The interviewees, in addition to those already mentioned, were McClelland, Eysenck, Festinger and Neil 
Miller. As bedside reading for psychologists the book can be recommended. 

L. S. HEARNSHAW 


Aspects of Reading Acquisition. Edited by J. T. Guthrie. Baltimore: Johns Hopkins University Press. 1976. 
Pp. 222. £9.70. 


This volume represents the proceedings of a two-day conference held at Johns Hopkins in 1974. It contains 
eight chapters. Two of these are excellent: Benson’s chapter on alexia, and the review by Satz, Friel & 
Rudegeair of the progress they have made so far in their extensive longitudinal study of the antecedents of 
specific reading disability. 

Benson proposes yet another taxonomy of the alexic syndromes, distinguishing between primary alexia 
(usually called ‘pure alexia’ or ‘alexia without agraphia’ and due to disconnection of the left angular gyrus 
from the visual cortices), secondary alexia (alexia with agraphia and symptoms of fluent aphasia; this is 
ascribed to damage in the left parietal-temporal region, especially the angular gyrus), and tertiary alexia 
(alexia with agraphia and symptoms of dysfluent, i.e. Broca’s, aphasia; this is ascribed to left frontal 
damage). This tripartite classification seems to me to represent a considerable advance over the many other 
attempts at classifying the alexias, including the earlier fourfold classification proposed by Benson & 
Geschwind several years ago. The chapter offers a balanced and readable survey of an often bewildering 
area of research, and manages to avoid technical terminology without sacrificing accuracy. 

Satz, Friel & Rudegeair are carrying out an investigation in which the results of a battery of tests 
administered to a child when he is in kindergarten are used to predict how well he will learn to read over the 
next three or more years. They administered the test battery to 96 per cent of the white male kindergarten 
population of Alachua County, Florida, in 1970 (n = 497). Of these boys, 459 were still available for testing 
at the end of Grade 3 (1974). Reading assessments classified 55 of these as ‘severely retarded’ and 77 as 
‘superior’. Application of multiple discriminant function analysis revealed that 91 per cent of the ‘severe’ 
category and 94 per cent of the ‘superior’ category were correctly predicted by the kindergarten battery. 
Cross-validation was carried out by applying the lambda weights derived from this discriminant function 
analysis to the kindergarten battery scores of a new group of boys (n = 181) tested in kindergarten in 1971. 
When their reading was assessed nearly three years later (at the end of Grade 2), 89 per cent of the 18 
‘severely retarded’ readers and 93 per cent of the 40 ‘superior’ readers had been correctly predicted by the 
kindergarten battery. These are impressive hit rates; they indicate that those nine-year-old children whose 
reading is unusually good or unusually bad could have been identified with a considerable degree of certainty 
when they were six years old and had not yet begun to learn to read. Such accurate prognosis must be the 
first step in attempting to reduce the incidence of reading disability, since if one can only identify the poor 
reader by the fact that he has not made progress in learning to read, it may be too late to do anything about 
it because of the effects on the child of his failure to learn to read. Satz et al. do not discuss what action 
might be taken with children identified in kindergarten as prospective poor readers, but the relative 
effectivenesses of various forms of action could now be directly investigated by them, given that they have 
an instrument for the accurate identification of such children at an early stage. They do argue, however, for 
the developmental-lag view of backwardness in reading. It is hard to see why this view is so popular: if a 
kindergarten child with a poor prognosis for reading just needs a little extra neurological maturity, all one 
would need to do is to delay the beginning of reading tuition until this extra maturity has been achieved: no 
special treatment of the child would be needed. I doubt if this is the case. Nevertheless, this work by Satz 
and his colleagues is obviously of major importance, and their chapter will be of great interest to everyone 
interested in what can be done about reading disabilities in young children. 
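
The hit rates quoted above are easier to weigh as head counts. The following is a minimal sketch, assuming only the group sizes and percentages given in this review; the counts it produces are rounded inferences rather than figures reported by Satz et al., and the function name `implied_count` is purely illustrative.

```python
# Sketch: convert the percentages quoted in the review into approximate
# head counts. Only group sizes and percentages are quoted, so the rounded
# counts are an inference, not figures taken from the study itself.

def implied_count(group_size, percent_correct):
    """Nearest whole number of children implied by a quoted percentage."""
    return round(group_size * percent_correct / 100)

# (group size, per cent correctly predicted), as quoted in the review
quoted = {
    "1970 cohort, severely retarded": (55, 91),
    "1970 cohort, superior": (77, 94),
    "1971 cohort, severely retarded": (18, 89),
    "1971 cohort, superior": (40, 93),
}

implied = {label: implied_count(n, pct) for label, (n, pct) in quoted.items()}

for label, hits in implied.items():
    n, pct = quoted[label]
    print(f"{label}: about {hits} of {n} correctly predicted ({pct} per cent quoted)")
```

On this reading, roughly five of the 55 poor readers in the first cohort, and two of the 18 in the cross-validation cohort, would have been missed by the kindergarten battery.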

It is a pity that the remaining six chapters of the book are not as good as the chapters by Satz et al. and 
by Benson. There is a long chapter by Entwistle entitled ‘Young children’s expectations for reading’ which 
unfortunately is not about what a child thinks reading is before he starts to learn to do it; instead, it reports 
the results of a large number of studies in which children are asked to predict what grades for reading, 
arithmetic and conduct they will be awarded in their next report card. Children’s predictions of their grades 
have quite high reliability but very low validity. Menyuk discusses the important topic of relations between 
acquisition of phonology and reading, but not in a particularly enlightening way: many of the analogies she 
suggests are most unconvincing (e.g. that scribbling corresponds to babbling). Kinsbourne's chapter, like that 
of Satz et al., is also about the prediction of reading disability, but Kinsbourne's battery predicted failure for 
50 of the 185 children who turned out to be good readers. The chapter contains a flurry of ideas about reading, 
none with much substance. Samuels contributes a worthy but pedestrian discussion of whether it is useful to 
treat reading as consisting of a hierarchy of component subskills. The chapter leads nowhere; nor does 
Resnick & Beck's chapter, which presents yet another ‘Model of the reading comprehension process’ (15 
boxes, 22 circles, and 46 arrows) as well as models for grapheme-phoneme decoding and for pronouncing 
graphemes. The chapter contains no data, or indeed evidence of any kind, so no reasons are given for 
preferring their models to anyone else's. The book concludes with a short summary by Williams, 
commenting on each of the preceding seven chapters in a remarkably uncritical way. 

In sum, then, six of these eight chapters are not of much interest, whereas the remaining two are 
outstanding. I should also point out that the book is poorly produced, with a disgraceful number of 
misprints, numerous bibliographical errors, and, in some chapters, much poor writing which should have 
been remedied by editorial intervention. 

MAX COLTHEART 


An Integrated Theory of Linguistic Ability. By Thomas G. Bever, Jerrold J. Katz and D. Terence 
Langendoen. Hassocks, Sussex: The Harvester Press. 1977. Pp. 432. £14.95. 


A more appropriate title for this book might have been ‘Papers towards an Integrated Theory of Linguistic 
Ability’, since no overall theory is coherently formulated, though much good exploratory work is done. Such 
a title would also have prepared the reader for the frequent repetition which comes from either printing for 
the first time or reprinting material originally produced as self-contained articles, not all of which are in fact 
written by the scholars whose names appear on the cover. 

The general argument strategy evinced by the book (henceforth abbreviated to ITLA) may be summed up 
in the maxim ‘divide and rule’. The contributors seek to check a current in linguistic theorizing over the last 
decade or so, primarily due to the so-called generative semanticists, which has tended towards the 
establishment of a monolithic theory of all aspects of language and language use. Their point of departure is 
what has come to be known as ‘the standard theory’ (ST): the theory of grammar as set out in Chomsky's 
Aspects of the Theory of Syntax (1965) in turn building on Katz & Postal's An Integrated Theory of 
Linguistic Descriptions (1964), a title which is clearly echoed in the present one. As is well known, this 
theory both specifies a number of internal levels of linguistic analysis (deep structure, surface structure, 
semantic representation, phonetic representation) and also maintains a rigid separation of intra- and 
extra-grammatical phenomena through the competence/performance distinction. The general tenor of 
ITLA is that these four interlocking levels of ST provide a paragon of conceptual neatness which should 
not lightly be discarded, and that apparently damaging counterevidence should, wherever possible, be 
neutralized by attempting to show that it is most appropriately accounted for by one of a number of 
independently motivated theories covering areas within the general domain of performance. 

The validity of such an approach depends to a large extent on whether these extra-grammatical theories 
are strong enough to bear the burdens imposed on them, since without this the claims are explanatorily 
empty. This much is, of course, recognized by the authors, and ITLA contains proposals regarding 
sociolinguistically controlled register variations (Bierwisch), pragmatics and presupposition (Katz & 
Langendoen) and conversational implicature (Harnish). Most important in the context of the present review, 
however, are the papers dealing with the interaction of mechanisms of speech perception with systems of 
grammar. We have Bever's early article ‘The influence of speech performance on linguistic structure’ (1970) 
conveniently reprinted together with work which builds on this foundation, notably Langendoen et al.'s 
‘Dative questions’ (1974), and Bever & Langendoen's interesting foray into historical linguistics ‘A dynamic 
model of the evolution of language’ (1971). All of this work follows the lead given by Chomsky’s classic case 
of uncomfortable data being explained away on external grounds, namely the type of centre-embedded 
construction which is perfectly acceptable in the rat the cat chased ate the cheese, but which dissolves into 
total unacceptability if the nesting is repeated as in the rat the cat the dog bit chased ate the cheese. This 
last sentence can be classified as grammatical (i.e. generated by the grammar) but unacceptable, thereby at 
one stroke accounting for its non-occurrence and yet obviating the need to put an arbitrary restriction on 
recursion in the grammar. 
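
The point about unrestricted recursion can be made concrete with a minimal sketch. It assumes a single recursive rule, NP -> noun [NP verb], which is an illustrative simplification and not a rule taken from ITLA or from Chomsky; the function names are likewise hypothetical. The same rule yields both the acceptable singly nested sentence and a doubly nested one of the kind discussed above, with nothing in the rule itself limiting the depth.

```python
# Sketch of the recursive rule NP -> noun [NP verb]. Names and the rule
# itself are illustrative assumptions, not notation from the book reviewed.

def noun_phrase(nouns, verbs):
    """Nest each later noun inside a relative clause on the one before it."""
    head, rest = nouns[0], nouns[1:]
    if not rest:
        return head
    # verbs[0] closes the outermost relative clause, so it is emitted last
    return f"{head} {noun_phrase(rest, verbs[1:])} {verbs[0]}"

def sentence(nouns, verbs, predicate):
    """Attach the main-clause predicate to the (possibly nested) subject."""
    return f"{noun_phrase(nouns, verbs)} {predicate}"

single = sentence(["the rat", "the cat"], ["chased"], "ate the cheese")
double = sentence(["the rat", "the cat", "the dog"], ["chased", "bit"], "ate the cheese")

print(single)
print(double)
```

Any cut-off on depth would have to be added to such a rule as a separate stipulation, which is the arbitrary restriction that the appeal to performance is meant to avoid.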

The enterprise is on much shakier ground when Bever & Langendoen in ‘Can a not unhappy person be 
called a not sad one?’ (1973) begin to investigate the possibility of acceptable ungrammatical sentences. 
Recall that in Chomsky's original and classic formulation a grammar must generate all and only the 
grammatical sentences of a language. Under this view, it is perfectly coherent that the grammar should 
generate, for example, multiply centre-embedded sentences, although they will later be ruled out on 
performance grounds. On the other hand, it is rather less than clear how a theory of performance can 
accept non-existent, because ungrammatical, sentences. There is a genuine dilemma here. Either the 
‘ungrammatical’ sentences do exist but are not generated by the grammar, in which case the competence/ 
performance distinction, so crucial to the ITLA approach, breaks down, or they are generated by the 
grammar, with the consequence that the whole question of acceptable ungrammaticality vanishes in a puff of 
smoke, which is probably the best fate for it. 

The argument motivating the idea of sentences being both ungrammatical and acceptable is a 
time-honoured one in transformational circles. The claim is that the grammar could only be made to generate 
the problem sentences at the expense of considerable complexity in the way of ad hoc conditions on rules 
and other notoriously costly theoretical devices. Thus, the argument goes, it is better to leave the rules in 
their ‘simple’ formulation, and try to find other means of dealing with the difficult cases. This same argument 
had a clamorous success when used by Chomsky in Syntactic Structures to prove the superiority of 
transformational grammars over phrase-structure grammars. It is interesting that recent work among the 
so-called interpretative semanticists has focused on, among other things, a richer theory of phrase-structure 
rules with a concomitant reduction in the number and power of transformations. The omission of any real 
discussion of this work is an important gap in the overall thesis of ITLA. If ST is going to be preserved by 
shifting the responsibility for dealing with potential counterevidence onto adjoining and interacting theories, 
not only must these theories be formulated in sufficient detail, but ST must itself be worth preserving. This 
point is touched on in several of the contributions, and is the central focus in two: ‘The fall and rise of 
empiricism’ by Katz & Bever, and Katz’s ‘Global rules and surface structure interpretation’. The former is a 
methodological polemic against generative semantics, which, it is argued, has led linguistics back to 
empiricism. If syntactic structures are derived from a semantic base which thereby conditions their form, 
and if, as seems reasonable, semantic structures reflect universal properties of the outside world, then the 
arguments for an autonomous set of syntactic universals, which form the core of the rationalist-innatist 
approach to language, are seriously undermined. The point is a good one, and the authors’ discussion well 
worth reading, but the general thrust seems misplaced. If the facts force us to adopt a generative 
semantics view of language, then so be it, regardless of whether or not this involves a return to empiricism. 
Fortunately, I do not think the facts dictate any such move. 

The other main defence of ST relates to the question of generative power. The usual arguments are put 
forward against ‘global’ and ‘transderivational’ constraints on the grounds that they enormously weaken 
linguistic theory by increasing the class of solutions available for any given linguistic problem. An interesting 
variant of this argument is developed by Katz, who suggests that the Extended Standard Theory (EST) 
developed in the recent work of Chomsky, Bresnan, Jackendoff and others also requires global power, 
despite its proponents’ claims to the contrary, since it permits semantic interpretation at both deep and 
surface structure levels. The major factor bedevilling all arguments of this kind is the by now notorious (and, 
in many quarters, notoriously neglected) results of Peters and Ritchie showing that transformational 
grammars of the standard type are formally equivalent to unrestricted rewrite systems, which is about as 
powerful as you can get anyway. In view of this, much more discussion of Chomsky’s work in the field of 
developing restrictive conditions on transformations would have been welcome. 

Let me conclude by repeating that there is much to interest linguist and psychologist alike in this book, 
but, as I have tried to indicate, important weaknesses seem to me to remain in the overall structure of the 
argument. 

NIGEL VINCENT 


Helping Troubled Children. By Michael Rutter. New York: Plenum Press. 1975. Pp. 376. 


If an individual possessing no specialist knowledge wished to gain some understanding of the theories and 
principles of child psychiatry, he need do no more than read this book. Professor Rutter has, with 
characteristic lucidity, managed to provide an elegant account, free from heavy professional jargon, of the 
ways in which child development and childhood disturbance are seen from a modern psychiatric viewpoint. 
He provides excellent coverage of most of the fundamental issues of child psychiatry and illustrates them 
with a plethora of case histories which should appeal to the voyeurism of most readers. 

A pleasant feature of this work is Rutter’s willingness to acknowledge that a child’s environment exists 
beyond the clinic walls. He takes a broad view of cause and effect in childhood disturbance in relation to 
environmental contingencies and emphasizes that in the majority of cases ‘treatment’ must take the form of 
intervention at home, school or wherever the problem exists. This attitude gives the book vitality and should 
do much to dispel the commonly held opinion that psychiatry is still a clinic-bound, inward-looking 
profession blinded to reality by self-substantiating precepts. 

An aspect which the reviewer found particularly refreshing was Rutter’s description of the assessment 
process based on a hypothesis formulation/testing model. In simple terms he outlines the scientific method 
which should underpin all assessment work with children and emphasizes the need to verify the final 
diagnostic hypothesis by evaluating the results of the treatment implied by it. The sad fact that this seldom 
happens in clinical practice in no way detracts from his account of what should happen. 

The structure of this book is quite logical. Rutter presents a good chapter on child development in which 
the biological aspects of growth are particularly well explained for parents and other non-medical people 
caring for children. He moves on to cognitive, language and emotional development emphasizing important 
stages and laying to rest some popular misconceptions arising from dubious psychoanalytic theory. Next he 
investigates individual variation and discusses the effect of family, school and peer group interactions. 
Following chapters present clear descriptions of emotional and conduct disorders. Most primary school 
teachers would benefit from reading his explicit and comprehensive resumé of learning problems in which 
some of the myths concerning underachievement and IQ are clearly settled. The final chapter is concerned 
with types of treatment. A lack of clarity creeps into his description of psychotherapy but this is forgivable 
in view of the lack of any concise agreement amongst the majority of psychotherapists as to its nature and 
aims. Otherwise his review of treatment methods is clear and instructive enough to warrant attention from 
trainees in all professions concerned with therapeutic work. His stance in this chapter is particularly 
noteworthy in that, with Graham (1974), he urges an open-minded appraisal of all therapies rather than a 
bigoted adherence to only one as is implicit in the works of many psychiatrist writers before him. 

Perhaps because the reviewer is an educational psychologist rather than a psychiatrist he found a 
disturbing and insidious element running throughout the book. Although Rutter takes a broad 
multidisciplinary view and states that ‘. . . a collaboration between equals is an essential part of the 
professional approach’ he still implies that his non-medical colleagues work within a psychiatric framework. 
Even when a therapeutic intervention takes place in the form of environmental manipulation he continues to 
suggest that it is a treatment falling within the medical sphere. Many non-medical professionals working 
toward the amelioration of children’s problems would baulk at this; one doubts whether a generic social 
worker offering case work within the community would claim any allegiance to a psychiatric creed and 
certainly the majority of educational and clinical psychologists would feel affronted if classed as ‘lay 
therapists’ or ‘para-medicals’. 

An otherwise excellent book is spoilt by this insistence on the medical model; the terms ‘diagnosis’, 
‘prognosis’ and ‘treatment’ appear with a distressing frequency to such an extent that the troubled children 
referred to in the case histories always seem to have something wrong within them. This draws attention 
away from the maladaptive aspects of their homes or schools which Rutter presents as the real causes of 
their difficulties. Although he is clearly aware of the problems of diagnostic labelling that this medical model 
produces he paradoxically emphasizes the importance of the process and presents the deliberations of the 
W.H.O. (Rutter et al. 1969) as window dressing. 

Educationalists have been happy to abandon this type of model (e.g. Tyson, 1970) and special education, 
in particular that concerning ‘maladjusted children’, has been healthier since. Why does psychiatry lag 
behind? Should it, as some suggest (e.g. Tizard, 1973) be confined only to the severest cases requiring 
in-patient treatment; its skills having little or no relevance to the needs of the vast majority of troubled 
children? 

PETER RANDALL 




GRAHAM, P. (1974). Child psychiatry and psychotherapy. J. Child Psychol. Psychiat. 15, 59-66. 

RUTTER, M., LEBOVICI, L., EISENBERG, L., SNEZNEVSKIJ, A. V., SADOUN, R., BROOKE, E. & LIN, T. Y. (1969). 
A tri-axial classification of mental disorders in childhood. J. Child Psychol. Psychiat. 10, 41-61. 

TIZARD, J. (1973). Maladjusted children and the child guidance service. London educ. Rev. 2 (2), 22-37. 

TYSON, M. (1970). Remedial aspects, in P. Mittler (ed.), The Psychological Assessment of Mental and 
Physical Handicaps. London: Methuen. 


Delinquency and Psychopathology. By D. O. Lewis and D. A. Balla. New York: Grune and Stratton. 1976. Pp. 
209. $14.75. 


Both authors (a psychiatrist and a developmental psychologist) have been members of the staff of the 
psychiatric clinic attached to the Juvenile Court in New Haven, Connecticut, since its inception in 1971. In 
this book, they report their impressions of the children seen at the clinic. During its first year of operation 5 
per cent of the total court population were referred to the clinic. Several epidemiological studies carried out 
by the authors, in an attempt to assess how representative their clinical impressions were of the total court 
population, are also reported. 

Among the clinic sample, the authors observed an association between delinquency and psychopathology. 
Extensive ‘diagnostic evaluation’ of the delinquents seen at the clinic involved interviewing the child and, 
where possible, his or her parents; obtaining information on the child and his family from outside agencies; 
psychological testing; and where requested, neurological evaluation of the child. This indicated that 
approximately a third of the children seen at the clinic ‘demonstrated clear evidence of serious 
psychopathology’, defined by the authors as ‘symptoms of significant central nervous system dysfunction, 
psychotic symptomatology, or both’. Also, for those parents of delinquents for whom adequate information 
was available, nearly three-quarters were classified by the authors as demonstrating significant 
psychopathology. 

Comparisons of the clinic sample with the total court population revealed that their age, sex, race and 
social class distributions were similar, but that the clinic sample had committed significantly more offences. 
Due to objections from the court staff, the authors were unable to pursue the most obvious test of how 
representative the observations of the clinic sample were of the total court population, namely, to subject a 
random group of non-clinical-referred children to the assessment procedures used at the clinic. Thus, they 
had to rely on retrospective investigations of various groups drawn from the population of Connecticut, 
concerning their official rates of delinquency and hospitalization in selected psychiatric hospitals in the area, 
to provide the necessary epidemiological evidence. The studies reported demonstrate a similar, if not as 
substantial, association between delinquency (filial) and psychopathology (filial and parental), as defined by 
the authors. 

Perhaps the most significant finding reported in this book is that delinquency is associated with minimal 
brain damage and psychosis rather than, as reported in most of the previous research, sociopathy. The 
authors suggest that a detailed clinical assessment of a child (such as theirs) is essential if any underlying and 
often obscure psychopathology is to be recognized, and that previous studies which have failed to carry out 
such detailed assessment have, as a result, overestimated the incidence of the most obvious explanation of a 
delinquent's behaviour - sociopathy. Because such a broad definition of sociopathy has been adopted, the 
authors believe that delinquency and sociopathy have become ‘for all intents and purposes synonymous’. 

Unfortunately, the book provides very little statistical data concerning the characteristics of the clinic 
sample (e.g. age, sex and social class distribution); the parents of the children; or the variety and incidence 
of the psychopathology observed in the children. Accepting the authors’ sentiments that ‘no statistic can 
convey the quality of a psychotically disorganized child. . . [nor]. . .reflect the kinds of disorder observed in 
many parents of delinquents’, such data might have complemented the many case histories reported, 
particularly when one considers the rather controversial nature of the findings and the authors' views on 
them. 

The problem of how representative the clinic sample is of the total court population is not resolved in the 
book. No mention is made of the criteria for referring children from the court to the clinic, but the text does 
provide some indication: ‘Buddy’s [a delinquent] probation officer had worked closely with the clinic on 
several other cases, and was quick to recognize that Buddy’s [head] injury might have something to do with 
his behaviour. He therefore requested a psychiatric evaluation’. The clinic staff were instructed to provide 
training for the court probation officers (the agents of referral). Given this, and assuming that the authors’ views 
and preliminary findings were freely aired in the clinic (as they were in academic journals), the possibility of 
selective referral by the probation officers cannot be dismissed. Data concerning the referral rates of 
individual officers, and any trends over time (as the officers' familiarity with the findings presumably 
increased), might have helped clarify the issue. 

Despite the disconcerting lack of statistical data, the doubts concerning how representative the clinical 
sample is of the total court population, and the restricted epidemiological evidence, the nature of the 
findings, and the way the authors deal with them (particularly impressive is their discussion of Rosenthal's 
‘schizophrenic’ spectrum of disorders), mean that this book should not be overlooked by the student of 
delinquency. 

DAVID LUCEY 


Mental Disorder: An Introductory Textbook for Nurses. By H. Snell. London: George Allen and Unwin. 1977. 
Pp. 200. £2.95. 


Pupil nurses are the golden goslings in psychiatric hospitals. In two short years they become Enrolled 
Nurses, with important back-up responsibilities in ward or day unit. They are in close daily contact with 
residents, who may have rarer chances to talk to more senior nurses. Racialism apart, the quality of insight 
brought by the 'coal-face' nurse to her relationships with confused, disturbed or frightened people is clearly 
central to the 'cure', when and if it occurs. 

Unfortunately, the golden goslings have received little golden grain in the way of preparatory texts. Pupil 
nurses are alternatively either insulted by otiosely simple presentations, or insulted and bored by referral to 
complex, student-oriented texts. The tacit implication is either that they are ‘thick’, or that they are really 
failing students who ought to try a little harder to understand. 

Mr Snell's book comes as a refreshing and competent addition to the scanty literature available to 
psychiatric pupil nurses. Avoiding the ‘men of bronze’ myth, he writes for this heterogeneous group in a 
clear, non-patronizing and eminently readable style. The book itself is well produced, clearly printed, 
reasonably durable and modestly priced. (With a little ‘do-it-yourself’ covering, enterprising pupils can make 
of it a vade mecum to last out training.) Throughout the text, a useful mnemonic ‘pulls out’ key ideas and 
useful terms in emphatic type. Nine black-and-white plates illustrate typical ‘involvements’ — a young 
schizophrenic crouches in the corner of his room ‘losing contact with reality’; nurses are seen in individual 
and group therapy with patients, helping out with ECT and intravenous chemotherapy; glimpses are given of 
a long-stay ward and occupational and industrial therapy units in action — all relating well to the text. (The 
quality of these monochrome photos is rather sombre — but then so are many of the real-life situations.) 

The book discusses mainly Section J items of the GNC syllabus, with a specially thorough coverage of 
common features of mental illness and ‘disordered’ behaviour, and related problems of nursing care and 
management. Early chapters provide a brief history of care and treatment; an outline of psychological 
development, pointing up the development tasks and ‘critical stresses’ faced at successive stages; and an 
overview of the ‘causes’ of mental illness. Here the author skilfully relates to his developmental discussion, 
emphasizing the multifactorial nature of causation, with special reference to personal experiences at critical 
stages of child and adult life - child-rearing practices, childbirth, marital and work stresses, finance, 
adjustment to middle age, retirement and bereavement are among social factors briefly and clearly outlined. 
Mr Snell’s commonsense and pragmatic approach are shown by typical statements in his chapter on ‘The 
concept of mental illness’ (descriptions of mental illness as a disorder of the mind ‘. . . tell us very little’; 
being labelled as mentally ill ‘. . . depends largely on behaviour’; what is regarded as constituting 
normal/abnormal behaviour ‘. . . may differ only in degree’ - and is also in part a function of ‘. . . age, 
education, background and the standards of the society in which he lives’). Symptomatology is discussed in 
terms of demonstrable disturbances in general behaviour, activity level, mood, perception, memory and level 
of consciousness. There are useful and accurate chapters introducing important topics such as epilepsy, 
mental handicap, ‘institutionalism’, and the management of aggressive and suicidal behaviour. The book 
ends with an outline of psychological and physical treatments; aims and methods of rehabilitation; and the 
resources available for the care of patients in the community. 

During early training days learners are sorting the jargon, bombarded by polyglot from the medical, social, 
analytic and behavioural schools. A valuable feature of this text is its lucid definitions — systematic use of the 
index combined with brief note-taking soon yields a useful working glossary. There are one or two misprints 
- ‘emotional liability’ (p. 42) will cause some head-scratching; and ‘ETC’ (p. 187) is not often done 
nowadays - whilst ‘masochism’ displays self-deprivative tendencies by foregoing an ‘h’ in the index (p. 199). 
Context, the oral tradition and subsequent editions will take care of these small faults. 

An excellent feature of the book is its case illustrations — 14 readable and realistic vignettes presenting 
people and situations readily identifiable in the learner's daily experience. They are a vehicle for much 
‘painless’ learning. What is it that compels Jack Roberts, a level-headed, hard-working milkman, to grovel 
on his knees by his hospital bed, his face a pallid mask of anxiety? Are ‘they’ really ganging up on young 
David Cook, and who is he talking to, apparently all alone in his room? Why does Mrs Swift refuse to shake 
hands with the nurse, and are those really cabalistic signs she is making with her spoon over cup and sugar 
bowl? Many of the answers are here, and with them - adroitly introduced — accounts of observation and 
history-taking, crisis and maintenance chemotherapy, ‘talking through’ in individual and group therapy, 
desensitization, the supportive role of community psychiatric nurse, the benefits of good occupational 
therapy and so on. These case illustrations also point up good empathic/supportive behaviour on the part of 
nursing staff. I personally felt that good, pithy line sketches of these situations would increase their impact, 
and offset a lack of visual material in the central section of the book. 

Each chapter ends with an average of eight ‘typical’ questions which pupils may expect to encounter in 
written exams, 12 per cent of which are culled from previous GNC papers. Beginning students - Mr Snell's 
other target group - would also have benefited from a brief, chapter-by-chapter reading list. 

All in all, Mr Snell has succeeded admirably at a most difficult task — the writing of an elementary textbook 
which is at once concise, authoritative and interesting. 

VAL. REED 


The Development of Traditional Psychopathology: A Source Book. By Mark D. Altschule. New York: Wiley. 
1976. Pp. 330. $19.95. 


Source books are published to meet, at least, two kinds of need; firstly to make available to the scholar 
relevant morsels from inaccessible works; secondly, to express a point of view concerning a concrete 
historical problem. The editorial techniques involved in these modes are, in consequence, very different in 
themselves. In the former case (Type I) selection of representative fragments must be made in terms of a 
proper understanding of the entire work one happens to be sampling. In the latter case (Type II) the editor is 
allowed an important licence: he can analyse the work into its primary components and make use of any of 
them to support the historical hypothesis he wants to sell. An editor wanting to take advantage of this 
licence however must pay an important price: namely, provide adequate, solid background information upon 
which his chosen bits can be meaningfully grafted. In no case can an editor have it both ways. 

These principles can be illustrated by referring to two examples; the first V. Skultan's notoriously bad 
Madness and Morals which fails on every account. It anthologizes works easily obtainable; it abuses 
chronological continuity, thus giving the impression of a ‘linearity of thinking’ which was totally alien to 
19th century British psychiatry; it offers very inadequate — and sometimes inaccurate — background 
information. The second, Hunter & McAlpine's Three Hundred Years of Psychiatry. Here we have a 
straightforward, honest effort of Type I above. It does not, to its credit, pretend to any historiographic 
sophistication; its fragments are chosen with common sense and gusto. In spite of the occasional 
anachronistic comments and general tendency to read history backwards, it fulfills its function superbly well. 

What can we now say about Dr Altschule’s source book? In the 1975 Totts Gap Colloquium on the 
Biology of Schizophrenia the author had already whetted our appetite concerning the publication of this 
work. This he did by going into print with delphic statements such as ‘in the 17th century when the concept 
of schizophrenia was developing. . .’, for whose substantiation we have been eagerly awaiting. Alas none 
such is forthcoming. Nor indeed is the claim made in the blurb that ‘It is a true [sic] “sourcebook”, going 
back to original texts [one wonders what else can a source book do!], checking available translations for 
accuracy, and providing new texts where misstatements were found. . .’ quite borne out by the results, 
although at least one recent reviewer seems to have swallowed it hook, line and sinker (Br. J. Psychiat. April 
1977). 

Let us organize our critical comments of this source book whose structure seems to follow Type II above. 
Its text covers a list of ten large topics (divided up into subtopics). No rationale is provided for the inclusion 
or exclusion of topics and one notices that some noble ones (noble both in terms of their centrality to 
psychiatry and the extant literature available on them) are not dealt with: e.g. ‘imperative-obsessional 
syndromes’, ‘will disorders’, ‘moral insanity’, ‘phrenology’. Likewise, a pleiad of historically very important 
signs which can best be described as ‘signs that are no more’ (like otohaematoma, unilateral hallucinations, 
stigmata of degeneration, etc.) are not even mentioned, probably because they are considered as not relevant 
to present-day psychiatry. 

Conversely, other topics owe their inclusion to sheer anachronism. What else can be said of the presence 
of ‘schizo-affective disorders’ (p. 206) and of the claim that ‘the concept is ancient’? Needless to say this 
disorder could neither exist as a name nor as a concept before the Kraepelinian dichotomy made it possible. 
Likewise, to interpret early historical references to Melancholia and Mania as conveying their present day 
sense (as Dr Altschule does and others have done; p. 139) is to write history backwards. There is absolutely 
no evidence for this correspondence other than the possibility that within the large group of those behaviours 
described as ‘melancholic’ by Hippocrates, Aretaeus, Galen, etc., some present-day ‘depressions’ may have 
been included. For what is often forgotten is that the notion of melancholia was not for the Greeks a 
‘psychiatric’ category but a general medical concept which seemed to have referred to situations where low 
rates of motor activity and general reduction of behavioural output were the main characteristics. 

Since a source book Type II must provide adequate background information we should expect this book to 
contain meaty introductions. Again we are disappointed. The short introductory paragraphs at best lack in 
balance and at worst are frankly misleading. We read: ‘after a British controversy about who had discovered 
the unconscious mind had simmered down. . .’ (p. 18). What British controversy? The controversy centered 
on the notion of ‘unconscious cerebration’ that featured Laycock and Carpenter? The debate on the 
philosophical notion of unconscious as was staged by J. S. Mill and Hamilton? The temporary ripples caused 
by Coupland's three volume translation of von Hartmann's Philosophy of the Unconscious? This kind of 
supremely vague statement fails therefore on, at least, two accounts. It either refers to a very well known 
historical point or else to a debate which Dr Altschule himself has just unearthed. If the former is the case 
then at least the locus classicus ought to have been quoted. If the latter, pertinent historical clues should 
have been produced, for I would not like my students to refer to the ‘British controversy on the 
unconscious’ without an adequate historical anchoring. In the event (it would seem that this bizarre claim 
meets neither horn of the dilemma), unless I am very mistaken, no work exists on the development and 
vicissitudes of the notion of unconscious mind in British intellectual history, apart from Northridge's short 
introduction — and this does not refer to any ‘British controversy’. 

Often comments are not misleading but irrelevant. For example, in the introduction to the problem of 
‘Ego’ in psychiatry we suddenly are told about REM sleep and ‘the growing permissiveness of American 
parents’. Likewise, when illustrating the area of ‘antisocial behaviour’ (where, by the way, the overworked 
troupe of writers on ‘moral insanity’ is paraded) we find some newcomers like the Swiss Matthey. Whether 
Dr Altschule believes that he is important as a primary or as a secondary writer (which he might well be) and 
why, we are never told. 

Let us now confront the next claim made about this book; that it is accurate. On p. 293 we find a 
translation from Janet’s Les Obsessins et la Psychoasthénie (sic). The fragment is taken from this work’s 
table of contents and there we find ‘les éclipses mentales’ (which in the original French is part of ‘les troubles 
de l’intelligence’, No. 3) translated and classified as No. 4 — indeed, as a new type! In the French original 
No. 4 is the numeral corresponding to ‘les troubles des émotions et des sentiments’. Furthermore, the 
‘Troisième Section’ of the French edition concerning the all important ‘insuffisances physiologiques’ (all 
important that is for he who knows of the relevance of the physiological background to the understanding of 
the notion of ‘psychasthénia’ both in Arnaud and, of course, Janet) is cut out altogether. So much for 
accuracy. 

A final example. On p. 119 Dr Altschule writes ‘a broader approach to the problem of anxiety was actually 
begun by Locke and later taken up by the psychiatrist Battie. . .’ and includes a quote from Locke's Essay 
to prove his point. Apart from the curious referring style (Locke is page-referred to one of the countless 
editions and not by book, chapter and paragraph) the incomplete quotation reprinted (II, 21 para. 31) is 
utterly misinterpreted. In fact it constitutes, with others in the same chapter, the effort (unsuccessful as it 
happened) made by Locke to solve the problem of the ‘activation of the will’. By considering ‘uneasiness’ 
(the term that Dr Altschule, all too eagerly, considers as synonym to anxiety in the modern sense) as 
equivalent to desire (see: ‘For desire being nothing but an uneasiness in the want of an absent good...’ II, 
21, 31) Locke ‘made volition an issue of the physical system, and man, even in the deepest root of his being, 
a part of nature’, Fraser aptly said in 1894. But this issue has little or nothing to do with ‘anxiety’ in the 
‘psychological’ or ‘physiological’ sense with which we use it nowadays! So what is this ‘broader approach’ 
to which Dr Altschule refers? Is it some sort of ‘existential’ level? If he simply wants to say that there was 
a shift in the meaning of ‘uneasiness’ during that period, namely that it was sometimes used as tantamount 
to ‘anxiety’, why choose the British philosopher and not, say, Shakespeare who in 1599 had already 
written; ‘There’s not I thinke a subject/That sits in heart-greefe and uneasiness/Under the sweet shade of 
your government’ (Henry V, II, 2) where, clearly the term is used in that sense? 

But this is not all. Whence comes the claim that Battie took this ‘broader approach’ from Locke? Is it 
simply because Battie states; ‘By which present uneasiness, according to Mr Locke’s just observation, the 
will is determined’? A close reading of the relevant text, however, does not bear out the existence of any 
direct line of influence from Locke to Battie concerning the ‘new view of anxiety’. In fact, it would seem 
that the term ‘uneasiness’ is used by Battie in at least two different senses (senses by the way already 
present in Burton's Anatomy of Melancholy) and that his reference to Locke does not entail a ‘continuity’ of 
thought but simply a good 18th century literary device. 

Unfortunately there is no space left for more examples. In summary, we can say that this expensive book 
does not live up to its promise, is often misleading and any beginner's chances of acquiring a balanced view 
of the history of psychiatry would certainly be jeopardized rather than helped by it. On the other hand, the 
initiated may find some use for it, namely as a quick source of reference but certainly not as a source book. 
But then, what a strange way of finding information for the historian of psychiatry whose job should be 
mastery of the primary sources! I can, therefore, find no real audience for this book. This is a pity, for Dr 
Altschule's historical skills (as shown in his publications over the years) combined with his ideal placement 
in the Francis Countway Library of Medicine at Harvard should have managed to produce a much better 
volume than in fact they have. 

G. E. BERRIOS 


Information Processing and Cognition. The Loyola Symposium. Edited by Robert L. Solso. Hillsdale, N.J.: 
Lawrence Erlbaum. 1975. Pp. 438, £9.85. 

Cognitive Theory, volume I. Edited by Frank Restle, Richard M. Shiffrin, N. John Castellan, Harold R. 
Lindman, David B. Pisoni. Hillsdale, N.J.: Lawrence Erlbaum. 1975. Pp. 303. £9.75. 


A recent survey showed that editors of symposia and conference proceedings are two and a half times as 
likely as their colleagues to suffer from fits of depression alternating with entirely irrational periods of 
optimism. The average editor should expect his best contributor to deliver his manuscript six months late, 
covered in illegible pencil corrections with all the references missing; two manuscripts will be lost in 
mid-Atlantic by the Post Office; three contributors will deliver manuscripts twice as long as requested; most 
contributors will prove incapable of conveying in print the flavour of their conference presentations, with the 
sole exception of the worst contributor, whose flawlessly typed manuscript effortlessly recreates 45 minutes 
of unforgettable boredom. And the editor cannot win: if the book is bad, the blame is his; if the book is a 
success, the credit goes to his contributors. 

The third Loyola Symposium: Information Processing and Cognition is edited, like its predecessors, by 
Robert Solso. It is perhaps the most successful of the three. There are three sections, concerned mainly with 
visual information processing, memory and language. The Loyola Symposium style is to give contributors 
room to develop their ideas, and this means that the writer with a substantial body of work to report, or 
important new ideas to communicate, is able to display his wares unhindered by the normal restrictive 
practices of journals. Estes, for example, is able to review his recent work on letter identification, which 
suggests that uncertainty about the spatial location of individual letters is a major contributory factor among 
errors in the perception of words, and Mayzner provides a readable account of a long series of studies on 
the perception of rapidly presented sequences of simple visual patterns. 

Of those papers which choose to emphasize ideas rather than data, two contributions, one by Posner & 
Snyder, and one by Mandler, on consciousness and cognitive control, are outstanding. Mandler describes 
consciousness as respectable, useful and probably necessary: the same could be said of his paper. Roger 
Shepard's paper is a little disappointing: his work on spatial transformations is fascinating, but no new 
material is presented in his paper and some of his theoretical ideas seem very incompletely worked out (for 
example, his idea that there are parallels between his work on the representation of visual information and 
linguists' notions about the representation of linguistic information). 

The psycholinguistic section contains some unusual papers. Anderson is now investigating the problem of 
language acquisition by a computer that processes language by means of Anderson & Bower's HAM 
program. MacNeill's paper on 'Semiotic extension' advances the thesis that language is intimately linked 
with gestures (the argument, as one might expect, consists largely of hand-waving). 

Solso provides his usual effusive introduction: contributions are ‘provocative’, ‘ambitious’, ‘spectacular’ 
and ‘penetrating’. But in this volume, at least, these adjectives are justified. 

Cognitive Theory is the proceedings of the 1974 Indiana Cognitive/Mathematical Psychology Conference. 
It has almost as many editors as contributors, and the papers are shorter than in the Loyola volume. This 
makes for some unevenness in the aims and standards of the contributors. The book contains four groups of 
papers, on speech perception, judgement, short-term memory and memory for text. The section on speech 
perception is of a uniformly high standard: Studdert-Kennedy, Cooper, Wood and Pisoni deal with the nature 
of phonetic categories, the way phonetic features are processed, and what dichotic listening studies tell us 
about speech perception. This is all very familiar ground for the speech perception specialist, but the papers 
are clear and not too technical, and especially for the non-specialist it is useful to have the results of much 
contemporary research collected within the same volume. The section on judgement is disappointing: the 
individual papers are not closely related to each other and none makes a substantial contribution. 

The section on short-term memory has some distinguished contributors (Bjork, Craik, and Jacoby, and 
Shiffrin) and will be welcomed by people who like this sort of thing. Shiffrin’s contribution especially is 
useful in identifying the agreements and disagreements between the various authors. I have the impression, 
though, that these authors are solving artificial intellectual puzzles (like crosswords), where the rules are that 
you must explain free-recall and paired-associate learning for unrelated words and nonsense syllables. 
Nobody questions whether this is a worthwhile task. Confidence is not raised by Shiffrin’s comment ‘In the 
next few years I expect we will see the development of models of STS capacity and loss of such complexity 
that the questions of current interest will be superceded and irrelevant’. 

The section on memory for text is useful. The trouble with psycholinguists is that they will spend so much 
time proving the obvious to each other: the demonstration by Paris, that children go beyond the information 
given when they comprehend simple stories, falls into this category, and Potts’ study of memory for linear 
order also fails to surprise. Restle’s is a much more original contribution. He shows that important features 
of several types of text (involving problems of temporal order and quite complicated relations between 
items) can be represented by networks and sets, and then carries out elegantly planned experiments which 
give us significant insights into memory for these texts. Cognitive Theory, like much contemporary cognitive 
theory, is very uneven, but it’s worth asking your library to get a copy. 

PHILIP SMITH 


Heroin Addiction: Theory, Research and Treatment. By Jerome J. Platt and Christina Labate. New York: 
Wiley. 1976. Pp. 417. £14.45. 


This work meets its stated purpose of providing a book-length integration of the massive literature on heroin 
addiction. Included are comprehensive sections on: (a) American and international historical context; (b) 
physiology and pharmacology — a comparatively high-level presentation which assumes knowledge of 
biochemistry — also embracing mortality, medical complications plus effects on various parts of the body, and 
theories of tolerance and dependence; (c) theories of addiction, developmental courses of addiction, and 
background, personality and social characteristics of heroin and polydrug users; (d) treatment; and (e) a short 
section on conclusions plus appendices on prevalence and measurement. 

A certain amount of bias and inaccuracy is probably inevitable in a volume of this scope. The extracts 
which follow should be seen in the perspective of a text which cites about a thousand references and which 
is generally fair and meticulous. Also, especially as the reader gets near to the end of the book, it becomes 
clear that the authors are well aware of the implications of the cultural and methodological limitations of 
much of the literature they cite. 

Some readers will no doubt feel that in the present text they too frequently encounter views that the 
regular non-medical use of opiates is a disease and that users require rehabilitation. For instance: 

‘Further, in comparing groups in which one already exhibits pathology, there is no way to state 
unequivocally that the differences obtained are not the results of the disease process itself, that is, of having 
been a heroin addict at the time of admission in this instance’ (p. 153). 

‘Ideal treatment should produce a stable, productive, law-abiding citizen who is drugfree (Jaffe, .)’ 
(p. 234). 

‘In addition to the termination of illicit drug use, “the expectations of the clinic staff [were] that the 
patient [would] change life-style from the underground culture of the drug world. . .to the conventional ethic 
of the American middle class. . .”. The successful maintenance patient should obtain legitimate employment, 
terminate illegal activities and associations with drug users, and become family- and community-oriented. To 
this end individual and group therapy were employed. . .’ (p. 285). 

Alternative views to these are certainly presented, especially in various concluding sections on issues and 
criticisms. For example, even in the middle of the chapter on methadone maintenance, the authors describe 
the results of in-patient methadone-on-demand detoxification and conclude that ‘allowing the addict greater 
control over and responsibility for. . .own treatment, far from resulting in wild abuse, manipulative behavior, 
and complete failure, appears to encourage responsibility and maturity in a population notably characterized 
by the absence of both’ (p. 272). Still, perhaps more of the contradictory material should have been initially 
even more qualified or the alternative views and issues cross-referenced. 

If a reader should arbitrarily decide to trace the accuracy of some references to papers cited as favouring 
compulsory treatment of addicts, some differences of interpretation might emerge. One reads that: ‘. . .coercive 
measures such as hospitalization, close supervision by parole officers and, ultimately, the power to remand 
an addict to jail are utilized in an attempt to make authority real and visible. The use of such authority to 
keep an addict in treatment and to curb destructive behavior has been found by Brill and Jaffe. . .to be most 
effective with adolescents’ (p. 209). 

However, if one actually consults the paper by Brill & Jaffe, it appears that hard data are not given, that 
the kinds of approach favoured are ‘rational authority’, ‘reaching out’, and a ‘public health approach’ - as 
opposed to ‘punitive authority’ - and that this rational approach is said to be particularly useful for 
adolescents, being generally otherwise ‘the most difficult to work with’ - that is, the paper does not seem to 
say that this approach is more effective with adolescents than with other groups. 

As a further example, according to Platt & Labate the work of Bowden & Langenauer is said to show that 
‘treatment modalities with an element of compulsory supervision were more likely to effect cure (62%), a 
finding confirmed by the extensive follow-up efforts of Vaillant’. However, Bowden & Langenauer do not, in 
fact, appear to show any significant result in support of this statement. And Vaillant's results are equivocal 
as are those of slightly earlier work which Vaillant cites. 

If one needs a book-length reasonably integrated presentation of current knowledge about opiate addiction, 
this work would be one’s obvious choice. However, the findings must be accepted only with great caution 
and perhaps with cross-checking — suggestions with which the authors would agree. 

HERBERT BLUMBERG 


The Measurement of Intrapersonal Space by Grid Technique. Volume 1: Explorations of Intrapersonal Space. 
By Patrick Slater. London: Wiley. 1976. Pp. 258. £8.75. 


This is an important contribution to the growing literature on the applications of personal construct theory 
using repertory grid technique. Its importance lies mainly in its being the first book in which different 
workers have described some of their work with the method. But there are several disappointments. Most 
authors describe their research results and seem to take grid methodology at its face value. Only Salmon 
discusses in any depth the general problems encountered, particularly in relation to work with children. 
Barton and colleagues likewise discuss grid usage with a mentally handicapped population, but at a more 
practical level than Salmon. Both these chapters are invaluable for the person wishing to use grids with these 
groups. 

Another disappointment lies in the fact that the majority of the 15 chapters describe work with the 
rank-order or rating forms. The reader would be pardoned for concluding that only rank and rating grids 
exist were it not for the non-conformists. For instance, Landfield describes Kelly’s original Repertory Test in 
his study of students’ suicidal behaviour. Honikman's chapter is particularly wide-ranging. It first covers 
laddering as a method of construct elicitation, then describes several types of grid, including that concerned 
with resistance to change, and ends with a brief but interesting account of the different types of information 
that can be obtained from different types of grid. 

A third limitation of this book is that methods of analysis are largely confined to those developed by the 
editor. Only one chapter, that of Norris & Makhlouf-Norris, describes any specific measures - yet even this 
is derived from Slater’s Ingrid. Nowhere is there anything more than a superficial description of the various 
computer programs (based on principal components analysis) that are used. This means that a fair degree of 
statistical sophistication is assumed in the reader. But this is by design. Slater hopes to whet the readers’ 
appetites by the grid reports in this volume so that they will go to volume 2 (not yet published) where they 
will find statistical details and discussions of methodology. If Slater is correct in thinking the pudding can be 
eaten without indigestion before understanding the blending of its constituent parts, the pudding plus recipe 
are going to be an expensive pair. Certainly this book cannot stand alone and the interested customer would 
do well to hold back from buying until the value of volume 2 can be assessed. 

While regretting the lack of variety in both grid forms, problems likely to be encountered and statistical 
methods discussed, those interested in personal construct psychology and grid method will find this good 
library reading, both for the several interesting chapters mentioned and for the importance nearly all 
contributors place on construct theory showing how, in many cases, the studies themselves arose within that 
theoretical framework. 

FAY FRANSELLA 


Perspectives on Cognitive Dissonance. By Robert A. Wicklund and Jack W. Brehm. New York and London: 
Wiley. 1976. Pp. 348. £12.00. 


One dissonance which has not been resolved after two decades of the theory’s existence and after 
innumerable experiments which ‘prove’ that dissonances are resolved, is that experienced by many 
psychologists about dissonance theory itself. On the one hand everybody agrees that the theory since its 
inception has generated more experiments than any other social psychological idea in that period; virtually all 
agree that these studies demonstrate remarkable ingenuity in experimentation, an asset which only a few 
iconoclasts criticize either because it invariably involves deception of subjects or because they feel so much 
ingenuity deserves a better cause, or for both these reasons. Furthermore, there is without doubt a genuine 
psychological phenomenon at the root of the theory; and a number of outstanding psychologists have 
contributed to its proliferation. 

On the other hand many psychologists have felt and still feel uneasy with that theory; it has been 
criticized on conceptual and methodological grounds from its inception onwards. Alternative 
conceptualizations for the underlying phenomenon exist, for example self-perception or defence 
mechanisms. The choice between such alternatives is no easy matter; it involves metapsychological 
assumptions and clarity about the domain for which the theories from which these concepts stem are 
appropriate. 

In search for such clarity this reviewer was seduced by the word Perspectives in the title of Wicklund & 
Brehm’s book in the expectation that it might, after all, reduce a long-standing dissonance by putting matters 
in their place, recognizing limitations, mistakes, methodological problems, as well as giving credit to other 
approaches. That this expectation would be disappointed the authors make explicit early in the text when 
they say that they have deliberately ignored metapsychological and epistemological issues (p. xiii). As a 
consequence, the title is misleading: ‘advocacy’ might have been more appropriate or, more charitably, 
‘development of a theory’. It is a book written for the converted; however, they deserve to receive further 
cognitive support in a more palatable style than, for example, in the following sentence: ‘We also find the 
quantitative impressiveness of this research to reflect investigators’ intrigue with the phenomenon of the 
human coming to justify having chosen an undesirable state of affairs’ (p. 77). 

Within these limitations the text proceeds systematically, giving overall results from many experiments, 
sometimes supported by quoting tables of results, though never in sufficient detail to permit a reader 
unfamiliar with the original studies to form a judgement on strengths or weaknesses in method. Beginning 
with a summary of Festinger’s original contribution the authors discuss among other issues commitment, 
choice and responsibility; they examine several alternative explanations of dissonance phenomena, all of 
which they find wanting. 

As one follows the development of the experimental work from one complex experiment to the next, two 
issues become conspicuous through the absence of discussion: cognitive dissonance started as an approach 
to the study of attitudes and attitude change; it appears to have left this focus behind; but its implicitly 
broader range of convenience remains undefined. Furthermore, the changes which dissonance theory has 
already undergone threaten to burst its original straightjacket of a tension-reduction model. While there is no 
doubt that some cognitive functionings fit such a model, there is also no doubt that others do not. If the 
theory is to encompass also the latter kind, is it not in need of a radical reformulation? The authors’ own 
unease about such issues - though most of the time successfully disguised — breaks out into the open when 
they declare that post-decisional regret, interfering with dissonance predictions, ‘. . .remains in part a 
mystery. . .it may be that regret is an extradissonance phenomenon’ (p. 318) and ‘. . .regret seems to be an 
everpresent possibility that can generally operate against the kind of rationalization we have come to 
associate with the theory’ (p. 319). The manner in which dissonance research deals with such recalcitrant 
phenomena is made clear; ‘. . .investigators have discovered the necessity of ruling out “counter-selective” 
processes such as curiosity, intellectual honesty, and the desire to refute opposing arguments. . . special 
pains must be taken in formulating research so that clear dissonance-theory predictions can be made’ 

(p. 321). Translated into experimental practice this means, for example, that 1n a dissonance experiment 
designed to change attitudes to the police some subjects who were discovered actually to have such attitudes 
were excluded from the analysis (not reported in this book). 

Perhaps it is only fair to underline that some other theoretical approaches in social psychology are also 
confused, which has not prevented them from making some interesting contributions, as it has not 
prevented dissonance theory from doing. One can agree with the authors when they cite When Prophecy 
Fails as one such instance. But they also quote another application of dissonance theory with approval. 
This is a study of car buyers who, having ordered a specific car, had to wait some time for delivery. 
During this period some prospective buyers apparently always change their minds and revoke the 
order. Armed with the insights from dissonance theory one firm was persuaded to have their salesmen make 
a reassuring phone call during the waiting period. Fewer of the people who received such calls cancelled 
than in a control group. Surprise! The authors have the good sense to comment on such ‘applications’ that 
‘there is no doubt that conceptual interpretations other than dissonance theory would be applicable . . .’ 
(p. 313). 

Altogether, then, this book will not reduce the dissonance many psychologists experience about 
dissonance theory. The case for and against it remains unaltered. To ask for more and more experiments, as 
the authors do (with the careful exclusion of curiosity or integrity), will not change the intellectual status of 
dissonance research. One wonders whether some of the gifted psychologists who have abandoned work in 
this area (Festinger, who started it all, or Bem, for example) have taken note of Hebb’s formidable 
statement: ‘What is not worth doing, is not worth doing well’. After many hundreds of experiments, what 
would be worth doing is to gain perspective on dissonance theory by thinking about what it all means. 
MARIE JAHODA 


The Achieving Society. By D. C. McClelland. Chichester: Wiley. 1976. Pp. xv+512. £12.75. 


This is a reprint of the first (1961) edition of The Achieving Society with a new introduction. In this 
McClelland discusses the standing of his theory of the major role played by achievement motivation in 
national rates of economic growth and he suggests that in the decade and a half that have elapsed since the 
book first appeared his hypothesis ‘has not been thrown into serious doubt’. 

It is scarcely possible to give a whole-hearted endorsement to this claim. The chief doubts about the thesis 
arise from its failure to predict national rates of economic growth in the contemporary period. National 
achievement motivation scores for 1950 were published in The Achieving Society so that it is now possible to 
check how far these national achievement motivation levels have predicted rates of economic growth. It has 
become increasingly apparent that the predictions have not been borne out. Several important nations have 
not fitted the theory at all satisfactorily. The most significant of these is Japan, for which McClelland gives 
a low achievement motivation rating, yet whose economic growth has been the most outstanding of any 
nation in the post-World War Two period. Conversely, the very high achievement motivation scores given to 
Argentina and India look increasingly odd as the shambling inefficiencies of these two economies lurch on 
from year to year. Indeed, in the early 1970s things had got so bad in India that the rate of economic growth 
actually turned negative. The overall picture is that there is clearly no correlation between McClelland's 1950 
national achievement motivation scores and subsequent rates of economic growth. Nor do McClelland's 
highly idiosyncratic indices of economic growth derived from electricity consumption fare any better. It is 
difficult to believe that McClelland can be unaware of the serious weakness of his theory in this respect and 
one would have hoped that he would have admitted the difficulty and made some effort to deal with it. 

On the other hand, the failure of one prediction is not too serious for a theory which clearly has a wide 
range of empirical support, much of which has come from independent research workers since the 
appearance of the first edition of the book. While the new edition will hardly be purchased for McClelland's 
new seven-page introduction, this reprinting provides a useful opportunity for libraries without The 
Achieving Society to buy what is unquestionably an important book. 

RICHARD LYNN 


Exploring Sex Differences. Edited by Barbara Lloyd and John Archer. London: Academic Press. 1976. Pp. 
280. £5.00. 


This book fills a gap in the literature between Maccoby & Jacklin's seminal work The Psychology of Sex 
Differences and the wealth of material still emerging which examines ‘women’s role’ from 
historical/economic/feminist viewpoints. Exploring Sex Differences contains ten essays; the first five examine 
sociological concepts, the later chapters look at biological influences. It would be a shame, however, if as a 
result of fairly obvious divisions of subject matter and approach, undergraduates in particular focused on 
what may appear more relevant to psychological issues — i.e. influence of hormones, etc. — for there is much 
readable and provocative material in the early chapters. 

The 'ideological' approach of the book is 'interactionist' — a term covering all shades of opinion of course, 
but, given that, the analysis by the writers of their particular area is generally balanced and interesting, with 
both classic and new arguments given their due airing, e.g. Mayo's chapter on ‘Psychopathology’; Roger's 
chapter on ‘Male hormones’ and Messent's on ‘Female hormones’ are clear expositions of available data 
with the unavoidable emphasis on animal work. John Archer, in a chapter which should be compulsory 
reading for undergraduates, puts ‘biological’ and specifically hormonal theories of sex differences into 
perspective. He does not avoid the issues involved (indeed he indicates that Andrew's work on testosterone 
has an interesting future); rather he points out that the application of such theories as arguments in moral 
and social issues is quite unjustified in the face of the plasticity of human behaviour. To argue from a 
so-called ‘evolutionary’ viewpoint (conveniently forgetting the essential mutability of that process) is to 
apply to modern-day society concepts which are inappropriate. If in primitive societies men went out to hunt 
and women reared the children that is no basis for saying things have to be similarly related under today's 
changed conditions. 

Rosenblatt & Cunningham in their examination of cross-cultural differences in the division of labour 
emphasize lack of mobility as a characteristic determining the nature of ‘feminine’ labour. This is due to the 
necessity for mothers to breastfeed their children. Lack of mobility may also be found in Western society 
where the same reasons do not apply, but is due rather to lack of transport. The concept of 'power' is also 
discussed in their chapter from the position that since the two sexes occupy separate social worlds then 
arguments about balance of power are probably misplaced. 

Strathern's chapter on cultural stereotypes deals ably with the ‘power’ concept. Actually the most difficult 
chapter in the book, it contains potent criticisms of current debates which attempt to separate 'nature' from 
'culture' in the analysis of sex differences. She observes that the notion that science can help to create an 
ideology of gender based on ‘fact’ as opposed to ‘myth’ is misplaced since science uses these very myths 
(gender constructs) to investigate sex differences. They can never be abandoned because they are 
fundamental to, not imposed upon, social processes. 

There are two chapters in the book about which I have reservations. Firstly, Kipnis on ‘Intelligence, 
occupational status and achievement'. The essay begins reasonably with a clear documentation of the 
ontogeny of sex differences in intelligence test scores, showing the close correlation between level of 
education and eventual IQ. She then notes that the former is the key to occupational status. Also Kipnis 
looks at the increasing number of gifted women who are 'successful' in terms of income and occupational 
status and finds a trend for them to marry later, have fewer children, or not to marry at all. Here then we 
have according to Kipnis the cause of a crisis for the technologically advanced countries, since the lowered 
breeding rate of these intelligent women is putting the gene pool at risk: ‘. . . those who inherit the earth will 
be the children of those who could get no better job' (p. 118). This is blamed on (female) 
'self-aggrandizement', the desire for individual fulfilment opposed to ‘. . . loyalty to a societal . . . standard of 
well-being' (p. 118), on the 'derogation of the feminine sex role' (the refusal to be solely responsible for 
rearing children?) and even the decrease of religious beliefs which traditionally influence (certain) parental 
attitudes. Thus following Kipnis' arguments, the ills of the human race can in future be blamed on the 
selfishness of intelligent women (only) in refusing to be mere breeding machines! Kipnis' understanding of 
genetics is as misconceived as that of the early 20th century eugenicists who wished to sterilize mental 
defectives. I leave it for the sociologists to monitor the gradual decline of the human race (or otherwise). 

The other chapter about which I have doubts is that by Diane McGuinness 'Sex differences in the 
organization of perception and cognition'. Whilst openly stating at the beginning of the essay her intention 
to be selective, and acknowledging the dangers of doing this, she does not appear to realize just how 
dangerous. 

The fact remains that in certain areas, notably the visual-verbal and object-social dichotomies of male- 
female characteristics, McGuinness has succeeded in arriving at completely the opposite conclusion from 
Maccoby & Jacklin who included ‘no-differences’ and contrary results in their analyses and totted up the 
weights of studies. For example, McGuinness cites Watson's (1969) experiment on operant conditioning of 
visual fixation under conditions of auditory and visual reinforcement. She quotes from the results of the first 
experiment in which females were found to respond under auditory reinforcement, males under visual — but 
totally ignores the second experiment, controlling conditions more carefully, where no sex differences were 
found. Indeed, Maccoby & Jacklin conclude that the sexes are highly similar in their responsiveness to 
visual stimuli on a variety of measures (The Psychology of Sex Differences, p. 351). 

Again, with the 'object-social' distinction her claim that girls are more responsive to social stimuli is in 
direct opposition to the conclusion arrived at by Maccoby & Jacklin based on a far wider and more critical 
assessment of evidence. Unfortunately McGuinness' own work, in which much of her faith is placed, is still 
in press and unavailable for critical examination. One experiment in particular is described, on which her 
object-social argument seems to rest, which does raise doubts in my mind. McGuinness & Symonds 
presented a task in which photographs are presented stereoscopically to subjects, one of an ‘object’ and one of 
a person. The subject experiences binocular rivalry and must ‘choose’ in order to report what he/she sees. 
Apparently the 'most meaningful stimulus predominates', and males more often report seeing objects than 
people and females the reverse. I sincerely hope that the experiment was carefully guarded against the effects 
of demand characteristics which can so easily occur when self-reports in an ambiguous situation are the data 
(e.g. see Sheehan & Neisser, ‘Variables affecting the vividness of imagery in recall’, reported in Sheehan’s 
book The Function and Nature of Imagery, p. 240). 

McGuinness’ essay contains some perceptive conclusions but I feel her reasoning is not always as sound as 
might be hoped. She comments lucidly on Witkin’s work on field independence, but does not appear to 
recognize the uselessness of such blanket terms as ‘analytic’ and ‘synthetic’ with reference to something 
which might be called cognitive style. She also accepts uncritically the notions of curiosity and restructuring 
in problem solving. 

Typographically I have only one small complaint, which is the lack of lines round the tables, which seems to 
make the data harder to assimilate: I got temporarily lost when turning the page to Table 1 in Ullian's 
chapter on p. 34. There is also half a sentence missing on p. 247 which fortunately is not crucial to 
comprehension. 

The book in general is a must for the library and a worthwhile addition to a researcher’s shelf, for its high 
standard of exposition summarizing a range of important scientific approaches (and pitfalls) in a complex and 
highly charged area. 


S. J. CLEAR 
publication to very short articles, not exceeding two 
printed pages in length (about 1000 words). As 
a rule such articles need not have a summary or 
titled headings. 


2. The circulation of the Journal is world-wide. 
There is no restriction to British authors; papers 
are invited and encouraged from authors throughout 
the world. 


3. Papers should be as short as is consistent with 
clear presentation of the subject matter; in general 
they should not exceed about 7000 words. A 
summary of up to 200 words should be provided. 
The title should indicate exactly but as briefly as 
possible the subject of the article. Papers will be 
evaluated by the Editor and referees in terms of 
their theoretical interest, practical interest, relevance 
to the Journal, and readability. 


4. Publication is speeded by care in preparation. 
(a) Authors are asked to submit a separate front 
page reporting the title of the article and their 
name(s) and affiliations. The author(s) name(s) 
should not then appear as such within the text. 
(b) Contributions should be typed in double spacing 
with wide margins and only on one side of each 
sheet. Sheets should be numbered. The top copy 
and at least one carbon copy should be sub- 
mitted and a copy should be retained by the 
author. 
(c) Tables should be typed in double spacing on 
separate sheets. Each should have a self- 
explanatory title and should be comprehensible 
without reference to the text. They should be 
referred to in the text by arabic numerals. Data 
given should be checked for accuracy and must 
agree with mentions in the text. 
(d) Figures, i.e. diagrams, graphs or other 
illustrations, should be on separate sheets, 
numbered sequentially ‘Fig. 1’, etc., and each 
identified on the back with the author’s name 
and the title of the paper. They should be 
carefully drawn, larger than their intended size, 
suitable for photographic reproduction and clear 
when reduced in size. Special care is needed with 
symbols: correction at proof stage may not be 
possible. Lettering must not be put on the 
original drawing but upon a copy to guide the 
printer. Captions should be listed on a separate 
sheet. 
(e) Bibliographical references in the text should 
quote the author’s name and the date of 
publication thus: Bartlett (1953). They should be 
listed alphabetically by author at the end of the 
article according to the following format: 


Belson, W. A. (1975). Juvenile Theft: The Causal 
Factors. London: Harper & Row. 

Fraser, C. O. (1976). Cognitive strategies and 
ra ae scaling. Br. J. Psychol. 67, 


Particular care should be taken to ensure that 
references are accurate and complete. Where 
books are available in both hardback and paper- 
back please give references to both editions and 
publishers. For journal and other abbreviations 
use only those given in the B.P.S. booklet 
Suggestions to Authors, otherwise give all names 
in full. 

(f) SI Units must be used for all measurements, 
rounded off to practical values if appropriate, 
with the Imperial equivalent in parentheses. 

A guide to SI Units is given in the B.P.S. 
booklet Suggestions to Authors. 

(g) Supplementary data too extensive for publication 
may be deposited with the British Library 
Lending Division. Such material includes 
numerical data, computer programs, fuller 
details of case studies and experimental 
techniques. The material should be submitted to 
the editors together with the article, for 
simultaneous refereeing. Further details of the 
scheme are given in Bull. Br. psychol. Soc. (1977), 
30, February. Copies of Supplementary 
Publications may be obtained, at a cost of 5p a 
page (including postage), from British Library 
Lending Division, Boston Spa, Wetherby, 
Yorkshire, LS23 7BQ. 


5. Proofs are sent to authors for correction of print, 
but not for introduction of new or different material. 
They should be returned to the Press Editor, 
together with the typescript, as soon as possible. 
Fifty complimentary copies of each paper are 
supplied on request; further copies may be ordered 
when proofs are returned. 


6. Submission of a paper implies that it has not 
been published elsewhere. The author is responsible 
for getting written permission to publish lengthy 
quotations, illustrations, etc., of which he does not 
own the copyright. 


7. The tendency is growing for articles to be 
reproduced abroad without permission. To protect 
the interests of authors and journals the B.P.S. 
requires copyright to be assigned to the Society (by 
signing a form), on the express condition that 
authors may use their own material elsewhere at any 
time without permission. The author's consent, and 
approval for a suggested fee, will be sought before 
applications to reproduce material are granted. 
Further details are given in the B.P.S. booklet 
Suggestions to Authors. 